criterion validity of the rorschach mutuality of autonomy (moa) scale: a meta-analytic review

Running Head: MUTUALITY OF AUTONOMY META-ANALYSIS

The Criterion Validity of the Rorschach Mutuality of Autonomy (MOA) Scale:

A Meta-Analytic Review

Abstract

The present study consisted of a meta-analytic review of the criterion validity of the

Rorschach Mutuality of Autonomy (MOA; Urist, 1977) scale. Search procedures yielded 27

independent samples (total N = 1,803, average n = 67, SD = 31) for inclusion in the meta-

analysis. Results support the criterion validity of the MOA with an average overall weighted

effect size of r = .24, p < .001 (95% confidence interval = .17-.30). Publication bias analyses

indicate the potential for bias and demonstrate that the likely impact of such bias would bring the

average overall weighted effect size down to r = .18, p < .001 (95% confidence interval for

adjusted effect size = .11-.24). The data were not demonstrably heterogeneous (Q = 37.29, df =

26, p = .07), and all between-study moderator analyses were nonsignificant (ps > .20) with the

exception of the specific type of criterion variable. Implications for future research and clinical

practice are discussed.

Keywords: meta-analysis, Mutuality of Autonomy scale, MOA, Urist, MAS, Rorschach Inkblot

Method, object relations, mental representations, psychodynamic theory, personality assessment

The Criterion Validity of the Rorschach Mutuality of Autonomy (MOA) Scale:

A Meta-Analytic Review

Internal mental representation is a pantheoretical construct that cuts across attachment

theory as well as cognitive-behavioral, psychodynamic, and interpersonal psychotherapy

approaches (Lukowitsky & Pincus, 2011). Mental representations of early life experiences with

significant caretakers form the basis for understanding personality development,

psychopathology, and how each individual perceives, reacts to, and organizes interpersonal

interactions (Blatt, 2008; Blatt, Auerbach, & Levy, 1997; Bowlby, 1969/1982). These

representations also help regulate emotional reactions to unpleasant stimuli (Selcuk, Zayas,

Günaydin, Hazan, & Kross, 2012). The quality of mental representations (or interpersonal

schemata) impacts current relationship functioning, including quality of therapeutic alliance

(Piper et al., 1991; Ryan & Cicchetti, 1985); transference interpretation (Høglend et al., 2011);

psychotherapy attrition, continuation, and outcome (Ackerman, Hilsenroth, Clemence,

Weatherill, & Fowler, 2000; Blatt, Wiseman, Prince-Gibson, & Gatt, 1991; Hilsenroth, Handler,

Toman, & Padawer, 1998); and DSM-IV Global Assessment of Relationship Functioning

(GARF; APA, 1994) scores among psychiatric inpatients (Fowler et al., 2004).

The construct of internal representations—and more broadly object relations theory as a

whole—constitutes a crucial element in psychoanalytic theory and research as an explanatory

model for mental structures, cognitive capacities, and ego strengths. Freud laid the initial

foundation for object relations theory within his instinctual model of development. Specifically,

Freud (1905/1957) viewed relationships (or the consequence of object choices) as sources of

gratification for specific needs or instincts. Freud (1914/1957) examined the intrinsic motivation

for, and quality of, object relations relative to past experiences with significant caretakers,


particularly the early mother-child dyad, as well as its decisive role in determining subsequent

ego development and identity formation. The contributions of Kohut and Kernberg in

understanding borderline and narcissistic conditions of psychopathology further broadened the

scope of developmental theory, contributing to the relational turn in psychoanalytic theory

(Greenberg & Mitchell, 1983).

Kohut (1966, 1971) proposed that during the years of pre-Oedipal development, children

tend to experience others as undifferentiated selfobjects whose function is primarily to serve the

child’s mirroring, idealizing, and twinship needs. Over time, healthy development is marked by

the achievement of a cohesive-self, a psychic apparatus that integrates ambitions, ideals, and

shared self-representations. Along somewhat similar lines—although certainly not identical—

Kernberg (1966, 1970, 1975) maintained that the internalization of object relationships

represents a crucial organizing factor for both ego and superego development. Introjections,

identifications, and ego identity formation represent a developmental progression from

undifferentiated representations of the self and other to more maturely developed, whole object

representations integrating both positive and negative attributes of the self and other. Object

representations not only provide important information about the developmental level of

personality organization and the quality of relational experience to which an individual has been

exposed, but they also reflect the particular drives and enduring cognitive-affective processes

that color experience and shape an individual’s understanding of reality (Blatt & Lerner, 1983).

Mayman (1967, 1968), Urist (1977), and Lerner (1986) incorporated object relations

theory into psychological testing. Each investigator focused on the heuristic value of the

phenomenological and experiential aspects of implicit relationship schemas as they are played

out in the individual’s current life and in the treatment relationship. This emphasis constituted a

significant shift away from using psychological testing to determine a patient’s analyzability and

toward assessing specific conflicts around autonomy, basic trust, and fears of merger. Such

conflicts can interfere with the establishment of a therapeutic alliance, and assessing them allows

therapists to anticipate potential therapeutic stalemates and malignant transference enactments not

readily discernible at the outset of the treatment (Gorney & Weinstock, 1980; Lerner, 1986). According

to Lerner (1986),

Patients with severe character pathology typically present a contradictory, vexing, and misleading picture to the clinician. In treatment, adverse clinical occurrences such as premature terminations, turbulent transference-countertransference struggles, negative therapeutic reactions, and treatment stalemates are more the rule than the exception (p. 132).

The analysis of the dynamics and affective quality of these object representations embedded in

Rorschach response profiles forms a crucial element of psychological assessment, particularly for

individuals with varying degrees of personality pathology. If these patterns of interrelatedness

are understood early in the treatment process, clinicians can be better prepared to identify

transference distortions and treatment roadblocks, as well as formulate appropriate interventions.

The Rorschach Inkblot Method (Rorschach, 1921/1951) is particularly useful for

assessing differential development of mental representations and psychopathology, and it has an

extensive psychoanalytic theoretical and empirical evidence base (Allison, Blatt, & Zimet, 1968;

Aronow & Reznikoff, 1976; Aronow, Reznikoff, & Moreland, 1994; Holt, 2009; Kwawer,

Sugarman, Lerner, & Lerner, 1980; Kissen, 1986; Kleiger, 1999; Leichtman, 1996; Lerner &

Lerner, 1988; Lerner, 1991, 1992, 1996, 1998; Masling, 1983; Rapaport, Gill, & Schafer, 1945-

1946, 1968; Schachtel, 1966; Schafer, 1948, 1954, 1967; Silverstein, 1999; Weiner, 1966).

Because the Rorschach is a performance-based measure of personality, the thoughts and images

attributed to the inkblots are shaped by the organizing characteristics of the individual’s underlying inner


representational world (Blatt & Lerner, 1983). These representations provide important clinical

information regarding patients’ construction of reality and the quality of their cognitive-affective

structures through which their perceptual experience is transformed into subjective meaning. As

such, object representations serve as cognitive-affective templates that mediate internal demands

and external behaviors, particularly within the context of interpersonal relationships.

The Mutuality of Autonomy (MOA; Urist, 1977; Urist & Shill, 1982) scale for the

Rorschach is the most widely utilized scoring system to measure quality and structure of implicit

mental representations (Fowler & Erdberg, 2006). The MOA is relatively quick and easy to

score, requiring no more than 10 to 15 minutes for an average length protocol. Results from a

meta-analysis (Bombel, 2006) examining response and protocol level MOA data indicated

excellent inter-rater reliability. A recent review of the construct validity of the MOA (Bombel,

Mihura, & Meyer, 2009) suggested that the MOA is a strong measure of both the quality of

object representations and psychopathology, irrespective of the particular MOA summary

score utilized. The principal utility of the measure is its idiographic value in anticipating the

quality of interpersonal interactions with clinicians and those closest to the patient. As such, it is

viewed as an object relations scale with specific clinical applications. With over 30 years of

research examining construct and criterion validity of the measure, it is timely to conduct a meta-

analysis of the measure’s criterion validity across multiple domains and to determine which

summary scores are most psychometrically sound and clinically useful.

Mutuality of Autonomy Scale and Summary Variables

The MOA is particularly well-suited for the assessment of the dynamic interplay between

self and other representations as well as the affective quality of these relationships. The MOA

was designed to assess thematic content of relationships (stated or implied) between animal,


inanimate, or human perceptions on the Rorschach. Unlike most Rorschach indices, the MOA is

a 7-point Likert-type scale that yields multiple summary scores with different interpretive meanings

(see Appendix A for detailed scoring instructions). Particular scale points reflect

“developmentally significant gradations in [an] individual’s capacity to experience self and other

as mutually autonomous within relationships” (Urist, 1977, p. 4). Specifically, these scale scores

range from indicating healthy and benevolent relationships (MOA scale scores 1-2) to unhealthy

and increasingly malevolent relationships lacking distinct boundaries and differentiation between

self and other (MOA scale scores 5-7). The MOA summary scores examined in the present

meta-analysis are: MOA mean (i.e., the prototypical representation); MOA low (i.e., the single

healthiest, most adaptive, representation in the protocol); MOA high (i.e., the single most

pathological representation in the protocol); and MOA pathological score (i.e., the total of all

scale points 5, 6 and 7 that occur in a given protocol). In the present study, an additional

category of the MOA is analyzed, MOA Other, which is an assortment of all other MOA scale

scores identified in the empirical literature. These scores include: MOA health index (Fowler et

al., 2004); MOA 10-point Likert-type scale (Spear & Sugarman, 1984); MOA average of the

eight highest and eight lowest scores (Kavanagh, 1985); MOA average of the three highest and

three lowest scores (Urist, 1977; Urist & Shill, 1982); MOA overall (i.e., the rater’s judgment of

the most representative score; Urist, 1977; Urist & Shill, 1982); MOA median (Blatt, Ford,

Berman, Cook, & Meyer, 1988); MOA most adaptive (i.e., frequency of MOA scores 1 and 2;

Gerard, Jobes, Cimbolic, Ritzler, & Montana, 2003); MOA narcissistic responses (i.e., frequency

of MOA scores 3 and 4; Gerard, Jobes, Cimbolic, Ritzler, & Montana, 2003); proportion of

pathological scores to total scores (Mihura, Nathan-Montano, & Alperin, 2003); frequency of


MOA individual scale points 1 through 7 (Goddard & Tuber, 1989); and all MOA scale scores

that could not be more specifically identified from a given study (e.g., Brown-Cheatham, 1993).

Consistent with previous meta-analytic validity findings for the Rorschach (Hiller,

Rosenthal, Bornstein, Berry, & Brunell-Neuleib, 1999), we predicted for the present study that

findings would support the criterion validity of the MOA with a medium overall weighted effect

size. Second, we predicted that the MOA mean score would have the largest validity effect size

since it is based on all MOA data garnered from the Rorschach protocol, thus making it

psychometrically more robust. We predicted that the MOA pathological summary score would

have the next highest degree of validity given that it incorporates data from three levels of MOA

scores, in contrast to MOA high, which consists of only a single MOA score. Finally, although

we excluded studies that utilized MOA data as part of the determination of the criterion variable

(e.g., diagnosis; see Method section for further details) in order to avoid criterion contamination,

we did include studies in which the rater of the MOA data could have been aware of the criterion

variable (e.g., the study did not report whether or not the MOA rater was blind to the criterion

variable). To measure the potential impact of this methodological issue, we coded each study for

the degree of blindness of the MOA rater and examined whether this variable was potentially

associated with the size of the validity coefficient. All other analyses were considered

exploratory in nature.

Method

Literature Search

Four steps were utilized in our search of the empirical literature to find individual studies

for inclusion in the meta-analysis. First, we performed a series of PsycINFO and MEDLINE

electronic database searches from 1977 (when the MOA was first introduced) through July 29,


2010, when the final search was conducted. The following terms were used in our search:

“Mutuality of Autonomy,” “MOA,” “Urist and Rorschach,” “Object relations and Rorschach,”

“Mental representations and Rorschach,” “MOA and Rorschach,” or “Rorschach and MAS.”

Titles and abstracts of citations were reviewed and all potentially relevant publications were

retrieved and examined in full text. Second, we performed a manual search of the Journal of

Personality Assessment (JPA) from the first issue in 1977 through volume 93, issue 3 in 2011

(i.e., the May-June 2011 issue). The manual search of JPA focused on publications that related

to either the Rorschach generally or the MOA specifically, with relevance for the present meta-

analysis determined by examining the titles and abstracts, and followed by a full article review in

cases of uncertainty. Third, we used several review articles (Huprich & Greenberg, 2003;

Stricker & Healey, 1990) and book chapters (Lerner, 2006; Stricker & Gooen-Piels, 2004) to

locate relevant references. Lastly, we conducted a backwards check of the reference sections of

each relevant publication retrieved in the first three procedures to locate any additional

references for inclusion.

The following criteria were utilized to determine eligibility for inclusion in the present

meta-analysis: (a) the study must include some proximal measure that assesses criterion validity

of the MOA (e.g., correlations with GAF scores, diagnostic behavioral group differences,

psychotherapy outcome change data, self-reported ratings of intrapsychic or interpersonal

functioning, etc.); studies assessing construct validity by correlating the MOA to distal measures

of object relations (e.g., the Rorschach or TAT) do not meet study requirements; (b) studies that

contrast diagnostic group differences on the MOA must derive diagnostic conclusions

independently of performance-based test data to avoid criterion contamination (e.g., Berg,

Packer, & Nunno, 1993, utilized diagnoses of “narcissistic,” “borderline,” and “schizophrenic”


that were assigned based on data that included the Rorschach; this study was therefore excluded);

(c) the study had to report sufficient information in order to allow for calculation or estimation of

the effect size; (d) the study could be either experimental or naturalistic in its design, as long as it

included a sample of 20 or more participants to ensure adequate statistical power; (e) the study had to be

published (dissertations, unpublished manuscripts, or conference papers were excluded); and (f)

the study had to be written in English, or an English translation had to be made available by the

authors.

Data Abstraction

Effect sizes indicate the magnitude of the relation between two variables (Rosenthal &

Rosnow, 2007), in this case the relation between MOA scores and various criterion variables.

These statistics are coded as absolute values, and their directions (i.e., positive or negative) are

then determined by their relation with the a priori hypotheses of a given meta-analysis. All

effect sizes, therefore, were assigned a positive value if they were consistent with our a priori

predictions (e.g., if the MOA was positively correlated with number of psychotherapy sessions)

or a negative value if they were inconsistent with our a priori predictions (e.g., if the MOA

high—i.e., the most pathological MOA score in a protocol—was negatively correlated with total

number of diagnostic criteria met for antisocial personality disorder). These a priori predictions

were established by the authors of the present meta-analytic study based on theoretical

considerations and results of previous empirical studies (cf. Hiller et al., 1999).

To correct for the fact that correlation coefficients—the effect size metric used in the

present study—have a skewed distribution, all coefficients were first transformed into the log-

based statistic Fisher’s Z of r (Borenstein, Hedges, Higgins, & Rothstein, 2009). The next step

involved assigning weights to each coefficient before they could be aggregated. In general,


coefficients that are based on larger sample sizes have greater precision than those based on

smaller sample sizes. To account for these differences in precision, each coefficient was

weighted by its inverse variance, a statistic that is heavily based on the sample size associated

with a given coefficient. Next, these weighted coefficients were averaged. Finally, the results

were then transformed back from Fisher’s Z of r into r itself, all following standard meta-analytic

procedures (e.g., Borenstein et al., 2009; Lipsey & Wilson, 2001).
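
The steps described above (transform, weight, average, and back-transform) can be sketched in a few lines of Python. This is a minimal illustration and not the software used in the present study; the function names and the example correlations and sample sizes are hypothetical. The sketch assumes the standard approximation that the sampling variance of Fisher's Z is 1 / (n - 3), so the inverse-variance weight for each coefficient is n - 3, and it adds no between-study variance component.

```python
import math

def fisher_z(r):
    """Transform a correlation r into Fisher's Z (inverse hyperbolic tangent)."""
    return math.atanh(r)

def inverse_fisher_z(z):
    """Transform Fisher's Z back into a correlation r."""
    return math.tanh(z)

def weighted_mean_r(rs, ns):
    """Inverse-variance weighted mean correlation with a 95% confidence interval.

    Assumes Var(Z) = 1 / (n - 3), so each weight is n - 3. No between-study
    variance is added, so this is the simplest (fixed-weight) case.
    """
    zs = [fisher_z(r) for r in rs]
    weights = [n - 3 for n in ns]
    total = sum(weights)
    z_mean = sum(w * z for w, z in zip(weights, zs)) / total
    se = math.sqrt(1.0 / total)
    lower, upper = z_mean - 1.96 * se, z_mean + 1.96 * se
    return inverse_fisher_z(z_mean), inverse_fisher_z(lower), inverse_fisher_z(upper)

# Hypothetical correlations and sample sizes, for illustration only.
r_mean, ci_low, ci_high = weighted_mean_r([0.10, 0.30, 0.25], [40, 90, 60])
print(round(r_mean, 3), round(ci_low, 3), round(ci_high, 3))
```

Because the weights grow with sample size, a coefficient from a large sample pulls the average toward itself more strongly than one from a small sample.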

If a particular study reported analyses that were relevant to the present meta-analysis but

stated only that results were nonsignificant (without providing data that would ordinarily permit

calculation of an effect size), the effect size for this study was entered as r = .00 to be

conservative following standard meta-analytic convention (Horvath & Symonds, 1991; Martin,

Garske, & Davis, 2000). The statistical assumptions for meta-analysis require that only a single

effect size and p value for each study be included in meta-analytic calculations. When

studies reported multiple effect sizes, then, these effect sizes were averaged, once more

following standard meta-analytic convention (Horvath & Symonds, 1991; Martin et al., 2000).

As before, calculations were conducted using Fisher’s Zr transformation. Results were then

transformed back into r.
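
The two conventions just described (entering r = .00 for results reported only as nonsignificant, and averaging multiple effect sizes within a study in the Fisher's Z metric) can be illustrated as follows. The function name and the use of None to mark a nonsignificant result without usable data are assumptions made for this sketch only.

```python
import math

def study_level_r(effects):
    """Average several correlations from one sample via Fisher's Z.

    `effects` holds correlations; None marks a result reported only as
    nonsignificant, which is entered conservatively as r = .00.
    """
    rs = [0.0 if r is None else r for r in effects]
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Hypothetical study with two reported correlations and one
# result described only as "nonsignificant".
print(round(study_level_r([0.35, 0.20, None]), 3))
```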

Most of the time, studies presented their data in ways that allowed for relatively

straightforward calculation of effect sizes using formulas found in standard meta-analytic texts

(e.g., Lipsey & Wilson, 2001). There were some instances, however, in which more

sophisticated calculations were required. Thus, for example, some studies reported contrasts

between several groups on MOA scores in which the different groups formed a continuum of

severity (e.g., normal controls, neurotic level patients, borderline level patients, and psychotic

level patients; Hibbard, Porcerelli, Kamoo, Schwartz, & Abell, 2010). In such cases, effect sizes


were calculated using the procedures outlined in Meyer, McGrath, and Rosenthal (2003). This

procedure, however, only yields an absolute value for the effect size. In order to obtain the

direction for this effect, we used a specific correlational analysis. That is, we analyzed the

direction of the correlation obtained when the scores of each participant (assigned as the mean

score for his/her particular group on the severity continuum) were correlated with numerical

weights (these were assigned using a priori hypotheses that specified which groups would fall in

particular places along the severity continuum; R. Rosenthal, personal communication,

December 22, 2010).

Several methodological guidelines were developed to facilitate coding decisions for the

present meta-analysis. When effect sizes were presented in a study indicating relationships

between both (a) MOA and total scale scores for a given criterion variable, as well as (b) MOA

and subscale scores (or individual score items) for a given criterion variable, only the

relationship between MOA and total scale scores was included in the present study.

Additionally, when relevant data were presented for the total sample of participants and then

further broken down into subgroups whose differences were not relevant to our a priori

hypotheses (e.g., male and female data presented separately), only the relevant data for the total

sample were utilized. Also, effect size data in a study based on only the frequency of total

responses that were coded for the MOA (e.g., Tuber & Coates, 1989) were not included since

these data are not typically utilized by either researchers or clinicians as assessing object

representations. Finally, when a study reported test data for an inherently continuous criterion

variable (e.g., depression) in both continuous as well as categorical forms (e.g., severely

depressed versus not severely depressed), only the continuous data were included in the present

meta-analysis.


Beyond just aggregating the validity effect sizes across different studies, we also sought to

investigate the relation between these validity effect sizes and several moderator variables. The

first author (XXX) therefore coded each study for the following moderators: quality of inter-rater

reliability for the MOA; sample level number of subjects; referenced MOA scoring method;

publication type; degree of blinding of the MOA scorer to the criterion variable(s); Rorschach

system method of administration; severity of pathology; subject age; sample type; race; gender;

criterion variable source; MOA scale score utilized; absolute value of the effect size; sample size

for the effect size; and sign of the effect size.1 When conducting analyses for categorical

moderators, we pooled the values for weights across all of the categorical subgroups. This

procedure was done only when a subgroup for a categorical moderator included fewer than six

studies (cf. Borenstein et al., 2009). This decision was based on the rationale that pooling the

relevant weights is likely to yield more accurate results than calculating separate weights for the

different subgroups in such circumstances (Borenstein et al., 2009).

We also sought to investigate how reliably each of the aforementioned codes could be

rated. A random subset of ten independent studies was therefore selected from the final sample

of studies included in the meta-analysis. The first (XXX) and fourth (XXX) authors

independently coded each study for (a) all effect size calculations, (b) sample sizes, and (c) all

moderator variables. Different statistical analyses are required when evaluating reliability for

ratings of continuous versus categorical variables. As a result, calculations of intraclass

correlation coefficients (ICCs) were conducted to evaluate inter-rater reliability for continuous

variables, whereas Kappa statistics were calculated to evaluate inter-rater reliability of

categorical variables (Orwin & Vevea, 2009). Because individual studies often reported multiple

effect sizes, ICCs for effect sizes and their sample sizes were calculated

based on the average effect size and average sample size for each independent sample.

Although ICC and Kappa calculations (both of which correct for chance agreement on ratings)

were sufficient in most instances, there were some cases that required a different method for

evaluating inter-rater reliability. Specifically, results of ratings for some variables had

distributions that made calculation of ICC or Kappa statistically impossible. In these instances,

therefore, we calculated a simple statistic for inter-rater reliability, i.e., percent agreement.

Finally, in cases where the two raters disagreed on a given code, the final coding was determined

by discussing the code until a consensus was reached.

1 The full coding scheme for sample and effect size level codings is found in Appendix B.
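
To make the distinction between chance-corrected and uncorrected agreement concrete, the following sketch computes Cohen's Kappa and percent agreement for two raters' categorical codes. It is an illustration under stated assumptions rather than the procedure used in the present study: the function names and example codes are hypothetical, and the undefined case shown (both raters assigning every study to the same single category, so that expected agreement equals 1) is only one example of a distribution for which Kappa cannot be computed.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Proportion of cases on which the two raters assigned the same code."""
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Cohen's Kappa: (observed - expected agreement) / (1 - expected agreement).

    Returns None when expected agreement is 1 (e.g., both raters used one
    and the same category throughout), because Kappa is then undefined.
    """
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    categories = set(codes_a) | set(codes_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    if expected == 1.0:
        return None
    return (observed - expected) / (1.0 - expected)

# Hypothetical codes for ten studies on a two-category moderator.
rater_1 = ["blind", "blind", "unclear", "blind", "unclear",
           "blind", "blind", "unclear", "blind", "blind"]
rater_2 = ["blind", "blind", "unclear", "blind", "blind",
           "blind", "blind", "unclear", "blind", "blind"]
print(percent_agreement(rater_1, rater_2), cohens_kappa(rater_1, rater_2))
```

When Kappa is undefined in this way, percent agreement remains computable, which is why it can serve as a fallback statistic.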

Quantitative Data Synthesis

A number of methods are available to aggregate effect sizes across studies in order to

obtain an overall weighted average. For the present meta-analysis, effect sizes were aggregated

using the random effects method of Hedges and colleagues (Hedges & Vevea, 1998). Random

effects methods are considered to be more representative of real-world data in contrast to the

alternative fixed effect approach (National Research Council, 1992), and they provide a more

conservative estimate of the average weighted effect size (Field, 2001, 2005).
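
As a rough illustration of why a random effects model is more conservative, the sketch below adds an estimate of between-study variance to each study's sampling variance before weighting. It assumes the method-of-moments (DerSimonian-Laird) estimator of between-study variance and the approximation Var(Z) = 1 / (n - 3); the exact estimator implemented in the software used for the present study may differ, and the function name and example data are hypothetical.

```python
import math

def random_effects_summary(rs, ns):
    """Random effects mean correlation using a method-of-moments tau^2.

    Requires at least two studies. Works in the Fisher's Z metric and
    transforms the summary back into r at the end.
    """
    zs = [math.atanh(r) for r in rs]
    variances = [1.0 / (n - 3) for n in ns]
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum_w
    # Q: weighted sum of squared deviations from the fixed effect mean.
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(rs) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance, floored at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    z_random = sum(wi * zi for wi, zi in zip(w_star, zs)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "r": math.tanh(z_random),
        "ci": (math.tanh(z_random - 1.96 * se), math.tanh(z_random + 1.96 * se)),
        "Q": q,
        "df": df,
        "tau2": tau2,
    }

# Hypothetical data, for illustration only.
print(random_effects_summary([0.05, 0.45, 0.20], [60, 80, 45]))
```

Whenever the estimated between-study variance is greater than zero, every weight shrinks and the confidence interval widens relative to the fixed effect result.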

In the present study, only weighted averages of the overall effect size are presented, rather

than unweighted averages (i.e., a simple calculation of the arithmetic mean of effect sizes across

all studies). Meyer and Archer (2001) argued that weighted coefficients are superior to

unweighted ones in meta-analyses that investigate relationships between predictors and their

criterion variables (which is the case in the present meta-analysis).


Calculations of effect sizes for each study were conducted using an Excel file developed

for this purpose (Diener, 2009).2 Effect size and moderator data were then entered into version 2

of Comprehensive Meta-Analysis (CMA; Borenstein, Hedges, Higgins, & Rothstein, 2005)

software for meta-analytic calculations. All reported p values in the present study are two-tailed

unless otherwise noted.

Rosenthal and DiMatteo (2001) argued for the importance of examining the potential role of

moderators. Examination of these moderators draws attention to hypotheses that may have

otherwise been neglected by authors of any given individual study (Rosenthal, Hiller, Bornstein,

Berry, & Brunell-Neuleib, 2001). In the present meta-analysis, moderator analyses for both

categorical and continuous variables were conducted.

Categorical moderator analyses were conducted using Q tests as an analog to analysis of

variance in primary research; these tests help determine whether the various levels of the

moderator variable (e.g., different types of MOA scores) differed significantly from each other in

the size of their validity coefficients (Borenstein et al., 2009; Lipsey & Wilson, 2001). When

this omnibus Q test was statistically significant or when specific a priori hypotheses were

postulated (i.e., differences in criterion validity by type of MOA scale score, as described above),

post-hoc comparisons were conducted using a Z-test (Borenstein et al., 2009) to determine

specific differences.
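
A post-hoc comparison of two subgroup means can be sketched as a Z-test on the difference between the two estimates. The sketch assumes that the subgroups are independent and that each subgroup mean and standard error is expressed in the Fisher's Z metric; the function name and example values are hypothetical, and the computations in the present study were carried out by the meta-analytic software rather than by code of this kind.

```python
import math

def subgroup_z_test(mean_a, se_a, mean_b, se_b):
    """Z-test for the difference between two independent subgroup means.

    Returns the Z statistic and its two-tailed p value under the
    standard normal distribution.
    """
    diff = mean_a - mean_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed

# Hypothetical subgroup means and standard errors (Fisher's Z metric).
z, p = subgroup_z_test(0.30, 0.06, 0.15, 0.07)
print(round(z, 2), round(p, 3))
```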

Continuous moderator analyses were conducted using mixed effects (method of moments)

meta-regression analyses which examined the relation between effect sizes for each study

(averaged across individual effects presented in a given study) and the different continuous

moderator variables (Borenstein et al., 2009). Since CMA software will not conduct a multiple

meta-regression analysis, each continuous moderator variable was examined utilizing a separate

meta-regression.

2 This file is freely available for download at https://sites.google.com/site/drmarcjdiener/statistical-calculators .

Publication Bias

Some critics have argued against the validity of meta-analyses generally, claiming that

meta-analyses include only studies that have demonstrated positive findings, whereas negative

findings get relegated to their experimenters’ file drawers and never enter the meta-analyst’s

purview. This possibility, they argued, calls into question the results of meta-analyses, which are

perforce based on a biased sample of research studies (Meyer et al., 2003; Rosenthal, 1991). To

address this argument, a series of analyses were conducted to examine the potential for such

publication bias: (1) Sterne’s funnel plot display analysis (Sterne & Egger, 2001; Sterne &

Harbord, 2004); (2) Begg and Mazumdar’s (1994) rank correlation; (3) Egger’s regression

intercept (Egger, Davey Smith, Schneider, & Minder, 1997); and (4) Duval and Tweedie’s

(2000a, 2000b) trim and fill procedure.
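
Of the four procedures listed above, Egger's regression intercept is the simplest to sketch: each study's standardized effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept that departs from zero suggests funnel plot asymmetry. The following is a minimal ordinary least squares illustration with hypothetical data and a hypothetical function name; it omits the significance test of the intercept and is not the routine used by the software in the present study.

```python
def egger_intercept(effects, standard_errors):
    """Regress standardized effect (effect / se) on precision (1 / se).

    Returns (intercept, slope). The intercept indexes asymmetry; the slope
    estimates the underlying effect. Needs at least two distinct precisions.
    """
    ys = [e / se for e, se in zip(effects, standard_errors)]
    xs = [1.0 / se for se in standard_errors]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical effects (Fisher's Z metric) in which the less precise
# studies report the larger effects, the pattern associated with bias.
effects = [0.10, 0.20, 0.40, 0.60]
standard_errors = [0.05, 0.10, 0.20, 0.30]
print(egger_intercept(effects, standard_errors))
```

When effect size is unrelated to study precision, the standardized effects rise in proportion to precision and the fitted line passes through the origin.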

Results

Study Flow

Figure 1 contains a chart detailing the flow of studies through the present meta-analysis.

The PsycINFO and MEDLINE electronic searches identified 767 abstracts for potential

inclusion. The manual JPA search, review of relevant articles and book chapters, and backwards

reference checks yielded an additional 101 publications for potential inclusion. In total, 868

records were screened. Six hundred and eight studies did not meet criteria for inclusion and

were immediately excluded. The remaining 260 studies were retrieved in full text and examined


for eligibility. Two hundred and twenty-two studies were excluded,3 and 38 remained for

inclusion (11 of which were aggregated due to instances of identical or overlapping samples).

Twenty-seven independent studies were utilized in the final meta-analysis.

Study Characteristics

Twenty-seven independent samples were included in the meta-analysis. The meta-analytic

sample consisted of a total of 1,803 participants4 with a mean number of 67 participants per

sample and a standard deviation of 31. Relevant information from the included studies is

summarized in Tables 1 and 2. Table 1 contains the overall effect sizes and Table 2 contains the

moderator codings for each independent sample.

Inter-rater Reliability

Table 3 presents the inter-rater reliability data for the coding of MOA effect sizes, sample

sizes, and moderator variables. According to conventional guidelines (Cicchetti, 1994; Fleiss,

1981), ICCs or Kappas below .40 are considered to demonstrate poor reliability beyond

chance, whereas values between .40 and .59 are considered fair, between .60 and .74 are

considered good, and between .75 and 1.00 are considered excellent. Results in Table 3 indicate

that all variables were rated with excellent reliability with two exceptions: the moderator variable

“Severity of pathology” was rated with good reliability, and the variable “Rorschach system

method utilized for administration” was rated with reliability falling short of the benchmark for

fair (see note to Table 3 for further details). Percent agreement guidelines (Hilsenroth &

Charnas, 2007; Weiner, 1991) established for Rorschach training and research point to a

minimum criterion of 80% agreement. Results in Table 3 indicate that all variables for which

percent agreement calculations were conducted demonstrated reliability greater than this

benchmark.

3 Several additional studies were excluded from the present meta-analysis that would have otherwise been included because identical effect size data were reported elsewhere in prior studies. For example, effect size data in the studies conducted by Blatt and colleagues (1988, 1990) were included in the meta-analysis. Identical effect size data were also presented in several additional studies (i.e., Cramer & Blatt, 1990; Cramer, Blatt, & Ford, 1988; Fertuck, Bucci, Blatt, & Ford, 2004) that were therefore excluded from the meta-analysis.

4 When discussing numbers of participants in the current meta-analysis, these numbers are rounded to the nearest whole number to facilitate ease of comprehension.
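The reliability bands and the percent agreement criterion described above can be expressed as a short lookup. The following is a minimal illustrative sketch in Python, not part of the original analyses. The function names are hypothetical, and the handling of values that fall between the stated band edges (e.g., between .59 and .60) is an assumption.

```python
def reliability_band(coefficient: float) -> str:
    """Map an ICC or kappa value to the descriptive band quoted in the text.

    Assumption: values falling between the printed band edges (e.g., .595)
    are assigned to the lower band.
    """
    if coefficient > 1.0:
        raise ValueError("ICC and kappa values cannot exceed 1.00")
    if coefficient < 0.40:
        return "poor"
    if coefficient < 0.60:
        return "fair"
    if coefficient < 0.75:
        return "good"
    return "excellent"


def meets_percent_agreement(percent_agreement: float, minimum: float = 80.0) -> bool:
    """Check a percent agreement figure against the 80% minimum criterion."""
    return percent_agreement >= minimum
```

For instance, a hypothetical coefficient of .62 would be labeled "good" under these bands, and a hypothetical percent agreement of 85 would satisfy the minimum criterion.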

Quantitative Data Synthesis

Overall effect size.

As predicted, findings support the criterion validity of the MOA with an average overall

weighted effect size of r = .24, p < .001 (95% confidence interval = .18-.29). Utilizing Cohen’s

(1988) standard benchmarks, this effect would fall between a small and medium effect size.

Hemphill’s (2003) empirically derived guidelines⁵ for interpreting the magnitude of assessment

and treatment findings further suggest an effect in the middle third of overall effect sizes of

psychological assessment and treatment. Results of the homogeneity test (Q [26] = 37.67, p

[one-tailed] = .07) indicated no demonstrable evidence of heterogeneity across the effect

sizes in the meta-analysis. The percentage of total variation observed (I² = 30.99%) attributed to

real differences in effect size rather than random variation is considered low (Higgins,

Thompson, Deeks, & Altman, 2003). Despite the aforementioned nonsignificant findings for

heterogeneity, following the recommendations of Rosenthal and DiMatteo (2001), we present the

results of several moderator analyses to test our a priori hypotheses and to aid future researchers

in further clarifying the results.
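The relation between the homogeneity statistic and the inconsistency index can be checked directly. The sketch below assumes the standard definition from Higgins et al. (2003), I² = 100 × (Q − df) / Q, truncated at zero; the function name is hypothetical and the snippet is illustrative rather than the software used for the analyses.

```python
def i_squared(q: float, df: int) -> float:
    """Return I-squared as a percentage, given Cochran's Q and its degrees of freedom.

    Negative values are truncated to zero, following the usual convention.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Values reported in the text: Q = 37.67 with df = 26.
print(round(i_squared(37.67, 26), 2))
```

With the rounded Q reported here, this yields approximately 30.98, which agrees with the reported I² of 30.99 to within rounding of Q.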

5 Guidelines are based on the distribution of correlation coefficients found among 380 meta-analytic studies (78 studies regarding psychological assessment and 302 concerning psychological treatment). Correlation coefficients less than .20 represent the lower third, those between .20 and .30 represent the middle third, and those above .30 represent the upper third.

Publication bias.

Figure 2 presents a funnel plot with data derived from the meta-analysis together with

imputed data to correct for potential publication bias utilizing the trim and fill procedure outlined

by Duval and Tweedie (2000a, 2000b). An “eye-ball” test of the funnel plot does seem to

indicate the potential for publication bias, given the asymmetrical appearance of the funnel plot

with a gap in the left bottom side of the graph (where studies with smaller sample sizes, and

therefore greater error, would have effect sizes in the opposite direction of the a priori

hypotheses). In the absence of publication bias, we would expect the plot to resemble a

symmetrical, inverted funnel (Sterne & Egger, 2001; Sterne & Harbord, 2004). Results of Begg

and Mazumdar’s (1994) rank correlation (τ = 0.35, p [one-tailed] < .01) and Egger’s (Egger et

al., 1997) regression intercept (intercept = 2.12, p [one-tailed] < .05) are consistent with these

aforementioned findings and indicate the potential for bias. These analyses, however, indicate

only whether there is potential for bias; they do not indicate the likely

impact of such bias (Borenstein et al., 2009). Using Duval and Tweedie’s (2000a, 2000b) trim

and fill procedure, we found nine studies that would need to be trimmed if publication bias did

influence the meta-analytic results. These results indicate that there is some difference between

the observed overall effect size for the present study and the effect size which is adjusted to

control for potential publication bias (see Figure 2). The likely impact of this potential bias

would be to bring the average overall weighted effect size down from .24 to .18 (significance of

the adjusted effect size, p < .001; 95% confidence interval for adjusted effect size = .11-.24),

indicating a 25% decrease in effect size.

Moderator analyses.

Moderator analyses were separated into categorical and continuous moderators. Table 4

presents results of the categorical subgroup analyses that compared the degree of criterion

validity across different MOA scale scores. Although the MOA mean and MOA pathological

summary scores seem to demonstrate the most robust effects, the difference between MOA scale


score effect sizes was not statistically significant (Qbetween = 1.91, df = 4, p = .75). Nevertheless,

post hoc comparisons were conducted since specific a priori hypotheses were generated for this

moderator variable. These post hoc results, however, remained nonsignificant (all Zs < 1.30; all

ps > .19).
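The p values accompanying the post hoc Z statistics appear consistent with two-tailed tests against the standard normal distribution. That correspondence is an inference from the reported values, not a statement of how the original software computed them. Under that assumption, a minimal sketch (with a hypothetical function name) of the conversion is:

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p value for a standard normal Z statistic.

    Uses the identity P(|Z| > z) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


# A Z of 1.30 corresponds to a two-tailed p just above .19,
# in line with the bound reported for the post hoc comparisons.
print(round(two_tailed_p(1.30), 3))
```

Any comparison with Z below 1.30 therefore has a two-tailed p above .19, matching the summary given in the text.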

Categorical subgroup analyses were conducted to compare the MOA’s criterion validity

based on the type of specific criterion variables⁶ utilized in individual studies (see Table 5). The

difference between specific MOA criterion variable effect sizes was statistically significant

(Qbetween = 21.08, df = 9, p = .01), suggesting differences in the MOA’s validity depending on the

type of criterion utilized. Post hoc analyses demonstrated that the validity of the MOA was

significantly greater when the criterion variable utilized was the MOA’s ability to distinguish

clinical from nonclinical samples relative both to when the criterion variable was intrapsychic

functioning (Z = 1.96, p = .05) as well as when the criterion variable was interpersonal

functioning (Z = 2.75, p = .006). In addition, the validity of the MOA was significantly greater

when the criterion variable was independent ratings of the participants’ mutuality of autonomy

relative to each of the following criterion variables: (a) diagnostic group differences (Z = 2.91, p

= .004), (b) behavioral criteria groupings (Z = 2.45, p = .01), (c) patient outcome change data (Z

= 2.70, p = .007), (d) level of symptomatology or overall functioning (Z = 2.22, p = .03), (e) level

of intrapsychic functioning (Z = 2.97, p = .003), (f) level of interpersonal functioning (Z = 3.53, p

= .0004), and (g) academic performance (Z = 2.03, p = .04). Validity of the MOA was

significantly greater when the criterion variable was discrete behavioral markers relative to when

the criterion variable was level of interpersonal functioning (Z = 2.15, p = .03). Finally, validity

of the MOA was significantly greater when the criterion variable was level of symptomatology

or overall functioning compared to level of interpersonal functioning (Z = 2.19, p = .03).

6 Descriptions of the full coding scheme for the specific criterion variables are found in Appendix B.


All remaining categorical subgroup moderator analyses were not statistically significant.

These included Rorschach administration method (Qbetween = 2.12, df = 2, p = .34), sample type

(Qbetween = .14, df = 2, p = .93), and referenced MOA scoring method (Qbetween = 2.87, df = 2, p =

.24). Additionally, none of the continuous moderator analyses was statistically significant.

These results included the meta-regressions for quality of inter-rater reliability (slope = -.08, p =

.31), degree of blinding to the criterion variable (slope = -.04, p = .24), age (slope = -.001, p =

.76), gender (slope = -.001, p = .54), race (slope = .000, p = .79), and severity of pathology (slope

= .001, p = .98).

Discussion

Hypothesis 1: Average Overall Weighted Effect Size

As predicted, the overall validity for the MOA found in the present study (weighted r =

.24) resembles the magnitude of validity for both the Rorschach (weighted r = .26) and the

MMPI (weighted r = .37) generally, based on results from the comparative meta-analysis

performed by Hiller and colleagues (1999). The validity results of the present study also

compare favorably with the most recent meta-analysis conducted across 53 Rorschach

Comprehensive System variables in which the mean overall validity coefficient for these

variables was r = .27 when evaluated using externally-assessed criteria (e.g., psychiatric

diagnosis, observer ratings; Mihura, Meyer, Dumitrascu, & Bombel, 2012). The results for

criterion validity of the MOA constitute a medium effect size using Hemphill’s (2003)

benchmarks or a small-to-medium effect using the guidelines of Cohen (1988). These results are

highly statistically significant and are unlikely to be attributable to chance. These findings further suggest

the overall utility of the MOA as a performance-based measure of object relations and

psychopathology in the context of research findings for other personality assessment measures.


Hypothesis 2: Criterion Validity Effect Size for the MOA Mean Scale Score

Although we predicted that the MOA mean score would demonstrate significantly greater

validity than other MOA scores, results demonstrated no statistically significant differences.

Nevertheless, the MOA mean score did demonstrate the most robust overall effect size across a

variety of MOA scale scores (r = .30, p < .001, 95% confidence interval = .22-.37). The mean

MOA score is understood to reflect an individual’s prototypical quality of interpersonal

relatedness, and it indicates an individual’s range or repertoire of adaptive and pathological

representations (Blatt, Tuber, & Auerbach, 1990). It is the most frequently reported MOA scale

score utilized by studies included in the present meta-analysis, and it has been found to be

particularly useful at predicting psychopathology (Bombel et al., 2009). Further, the MOA mean

score was the most effective predictor of interpersonal behavior and clinical functioning at the

commencement of psychotherapy, in addition to predicting the intensity and frequency of

clinical symptoms at outcome, relative to all other MOA scale scores (Cook, Blatt, & Ford,

1995).

Hypothesis 3: Criterion Validity Effect Size for the MOA Path Scale Score

Although we predicted that the MOA pathological score would demonstrate significantly

greater validity than other MOA scores (with the exception of the MOA mean), there were no

statistically significant differences between the effect sizes for the various MOA scores.

Nevertheless, the MOA pathological score did yield the next highest criterion validity effect size

(weighted r = .27, p < .001, 95% confidence interval = .10-.42) for MOA scale scores after the

MOA mean. As a composite of the most malevolent and aggressive responses present in a

Rorschach protocol, the MOA pathological score has been found to be a robust and stable

measure of pathological object relations (Fowler, Hilsenroth, & Nolan, 2000). The MOA


pathological score is particularly successful at discriminating between clinical and nonclinical

populations. For example, the MOA pathological score has demonstrated effectiveness in

distinguishing bulimic patients (Fowler, Brunnschweiler, & Brock, 2002)—as well as borderline

patients with and without self-mutilative behaviors (Baity, Blais, Hilsenroth, Fowler, & Padawer,

2009)—from nonclinical controls. The MOA pathological score has also successfully

discriminated between individuals with borderline pathology and those diagnosed exclusively

with Axis I conditions (Zodan, Charnas, & Hilsenroth, 2009). In addition, the MOA

pathological score has been positively related to the number of psychotherapy sessions attended

by patients (Ackerman et al., 2000).

Hypothesis 4: Potential Impact of Criterion Contamination

We tested the potential impact of criterion contamination by examining the relation

between the degree of blinding by the MOA rater (to the particular criterion variables

investigated in the original studies) and associated effect sizes across studies. If criterion

contamination presented a threat to the MOA’s criterion validity, then there should have been a

negative relationship between the degree of blinding by the MOA rater and associated effect

sizes. Findings from the present study indicated no significant relation between these two

variables (slope = -.04, p = .24), supporting the validity of the MOA across a variety of

methodological contexts.

Directions for Future Research

A limitation of the present study is that only published studies were included

in the meta-analysis. This inclusion criterion may have biased the overall results given that

stronger, more significant findings tend to find their way into the published literature more often

than less significant ones, creating a potential “file drawer” effect (Borenstein et al., 2009).


Publication bias analyses indicated the potential for selection bias, although results demonstrated

that the findings of the present meta-analysis cannot be attributed solely, or even primarily, to

such potential bias. Future studies may benefit from including unpublished studies and

dissertations. Rosenthal (1991), however, argues that even though these unpublished studies

tend to yield smaller effects (by about 1/5 of a standard deviation), published and unpublished

studies are nearly identical. Nevertheless, future research which includes unpublished data and

increases the number of studies for the meta-analysis (thus increasing the statistical power of the

moderator analyses) would strengthen and extend the present findings.

Given the nonsignificant differences found in the present study between the validity of

different MOA scale scores, and the fact that very few studies utilized MOA pathological and

health index scores, it is noteworthy that the Rorschach Performance Assessment System

(R-PAS; Meyer, Viglione, Mihura, Erard, & Erdberg, 2011) includes only the MOA PATH and

HEALTH scores. Further, although the differences between the effect sizes for the various

MOA scores were not statistically significant, the general pattern of results from the present

study is consistent with previous construct validity findings for the MOA (Bombel et al., 2009),

and they suggest that the MOA mean is perhaps better at capturing object relations phenomena

and psychopathology in comparison to the MOA pathological score. As future studies amass, it

will be important to see how adding the MOA PATH and MOA HEALTH, instead of other

MOA scale scores (e.g., MOA mean), to the structural summary can provide additional

information to enrich clinical interpretation and diagnostic clarification.

The present study also offers researchers useful information for hypothesis generation and

testing. More specifically, by delineating which criterion variables the MOA is best at capturing

(e.g., diagnostic group differences, ratings of interpersonal functioning, and psychopathology),


future researchers are better positioned to design and interpret studies of performance-based

measures of personality functioning. Further examination of the scale’s incremental validity

would also provide important information about the scale’s utility relative to other psychological

assessment measures.

Clinical Implications of the Results

Psychoanalytic theory provides a valuable organizing framework for enhancing personality

assessment (Bornstein, 2010). Applying psychoanalytic approaches to psychological testing

provides clinicians with a theoretical basis for elucidating the dynamic processes that underlie

performance-based measures of assessment (Jaffe, 1990, 1992; Sugarman, 1991; Sugarman &

Kanner, 2000). Results from the present study indicate that MOA scale scores are significantly

related to theoretically relevant criterion variables. The MOA, for instance, is successful at

detecting differences between clinical and nonclinical populations,

diagnostic group differences, and differences between groupings based on behavioral criteria

(e.g., self-mutilators versus non-self-mutilators). The MOA is also successful at discerning

discrete behavioral markers (e.g., number of psychotherapy sessions attended), psychotherapy

outcome change data, level of symptomatology (e.g., GAF or GARF scores), observer-rated

levels of mutuality of autonomy (based on ward behavior, chart review, and autobiographical

writings), and observer-rated levels of interpersonal and intrapsychic functioning (e.g., quality of

object relations and motivation for treatment). These findings represent a particularly high

standard of validity for a measure, given that many scales simply report validity data in terms of

distinguishing clinical from nonclinical samples. In addition, findings from this meta-analysis

suggest that the MOA may be particularly useful in capturing the object relations construct


originally conceptualized by Urist (1977) as well as theoretically relevant discrete behavioral

markers.

Further, results of the present study provide clinicians with a level of confidence in

utilizing the MOA for assessing an individual’s quality of object representations, and the ability

to use performance-based data as an element of a dynamic formulation that includes the quality

and degree of self-other differentiation, as well as a view of the individual’s prototypic relationship

patterns and basic fears in relating to others. Clinicians can focus on these object

representational deficits in the here-and-now process of psychotherapy. Although it is

inadvisable to use the data from the MOA as the sole source of information regarding object

representations, the information obtained can be utilized in conjunction with multiple sources of

data to anticipate potential therapeutic stalemates as well as avenues for repairing ruptures in the

therapeutic alliance (Eubanks-Carter, Muran, & Safran, 2010; Safran & Muran, 2000).


References

References marked with an asterisk indicate publications included in the MOA meta-analysis.

*Ackerman, S. J., Hilsenroth, M. J., Clemence, A. J., Weatherill, R., & Fowler, J. C. (2000). The

effects of social cognition and object representation on psychotherapy continuation.

Bulletin of the Menninger Clinic, 64, 386-408.

Allison, J., Blatt, S. J., & Zimet, C. (1968). The interpretation of psychological tests. New York,

NY: Harper & Row.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders

(4th ed.). Washington, DC: Author.

Aronow, E., & Reznikoff, M. (1976). Rorschach content interpretation. New York, NY: Grune

& Stratton.

Aronow, E., Reznikoff, M., & Moreland, K. (1994). The Rorschach technique: Perceptual

basics, content interpretation, and applications. Needham Heights, MA: Allyn & Bacon.

*Baity, M. R., Blais, M. A., Hilsenroth, M. J., Fowler, J. C., & Padawer, J. R. (2009). Self-

mutilation, severity of borderline psychopathology, and the Rorschach. Bulletin of the

Menninger Clinic, 73, 203-225.

Begg, C. B., & Mazumdar, M. (1994). Operating characteristics of a rank correlation test for

publication bias. Biometrics, 50, 1088-1101. Retrieved from

http://www.biometrics.tibs.org/

Berg, J. L., Packer, A., & Nunno, V. J. (1993). A Rorschach analysis: Parallel disturbance in

thought and in self/object representation. Journal of Personality Assessment, 61, 311-323.

doi:10.1207/s15327752jpa6102_11

*Blais, M. A., Hilsenroth, M. J., Castlebury, F., Fowler, J. C., & Baity, M. R. (2001). Predicting



DSM-IV Cluster B personality disorder criteria from MMPI-2 and Rorschach data: A test

of incremental validity. Journal of Personality Assessment, 76, 150-168.

doi:10.1207/S15327752JPA7601_9

*Blais, M. A., Hilsenroth, M. J., Fowler, J. C., & Conboy, C. A. (1999). A Rorschach

exploration of the DSM-IV Borderline Personality Disorder. Journal of Clinical

Psychology, 55, 563-572. doi:10.1002/(SICI)1097-4679(199905)55:5<563::AID-

JCLP4>3.0.CO;2-7

Blatt, S. J. (2008). Polarities of experience: Relatedness and self-definition in personality

development, psychopathology, and the therapeutic process. Washington, DC: American

Psychological Association.

Blatt, S. J., Auerbach, J. S., & Levy, K. N. (1997). Mental representations in personality

development, psychopathology, and the therapeutic process. Review of General

Psychology, 1, 351-374. doi:10.1037/1089-2680.1.4.351

*Blatt, S. J., & Ford, R. Q. (1994). Therapeutic change: An object relations perspective. New

York, NY: Plenum Press.

*Blatt, S. J., Ford, R. Q., Berman, W., Cook, B., & Meyer, R. (1988). The assessment of change

during the intensive treatment of borderline and schizophrenic young adults.

Psychoanalytic Psychology, 5, 127-158. doi:10.1037/0736-9735.5.2.127

*Blatt, S. J., & Shahar, G. (2004). Stability of the patient-by-treatment interaction in the

Menninger Psychotherapy Research Project. Bulletin of the Menninger Clinic, 68, 23-38.

*Blatt, S. J., Tuber, S. B., & Auerbach, J. S. (1990). Representation of interpersonal interactions

on the Rorschach and level of psychopathology. Journal of Personality Assessment, 54,

711-728. doi:10.1080/00223891.1990.9674032


Blatt, S. J., Wiseman, H., Prince-Gibson, E., & Gatt, C. (1991). Object representations and

change in clinical functioning. Psychotherapy: Theory, Research, Practice, Training, 28,

273-283. doi:10.1037/0033-3204.28.2.273

Bombel, G. A. (2006). A meta-analysis of interrater scoring reliability for the Rorschach

Mutuality of Autonomy (MOA) Scale. Unpublished master’s thesis, University of

Toledo, Toledo, Ohio.

Bombel, G. A., Mihura, J. L., & Meyer, G. J. (2009). An examination of the construct validity of

the Rorschach Mutuality of Autonomy (MOA) Scale. Journal of Personality Assessment,

91, 227-237. doi:10.1080/00223890902794267

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2005). Comprehensive

Meta-Analysis Version 2. Englewood, NJ: Biostat.

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-

analysis. Chichester, UK: John Wiley & Sons, Ltd.

Bornstein, R. F. (2010). Psychoanalytic theory as a unifying framework for 21st century

personality assessment. Psychoanalytic Psychology, 27, 133-152. doi:10.1037/a0015486

Bowlby, J. (1982). Attachment and loss: Vol. 1. Attachment (2nd ed.). New York, NY: Basic

Books. (Original work published 1969).

*Brown-Cheatham, M. (1993). The Rorschach Mutuality of Autonomy Scale in the assessment

of black father-absent male children. Journal of Personality Assessment, 61, 524-530.

doi:10.1207/s15327752jpa6103_8

Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and

standardized assessment instruments in psychology. Psychological Assessment, 6, 284-

290. doi:10.1037/1040-3590.6.4.284


Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Mahwah, NJ:

Lawrence Erlbaum Associates.

*Cook, B., Blatt, S. J., & Ford, R. Q. (1995). The prediction of therapeutic response to long-term

intensive treatment of seriously disturbed young adult inpatients. Psychotherapy Research,

5, 218-230. doi:10.1080/10503309512331331326

Cramer, P., & Blatt, S. J. (1990). Use of the TAT to measure change in defense mechanisms

following intensive psychotherapy. Journal of Personality Assessment, 54, 236-251.

doi:10.1080/00223891.1990.9673990

Cramer, P., Blatt, S. J., & Ford, R. Q. (1988). Defense mechanisms in the anaclitic and

introjective personality configuration. Journal of Consulting and Clinical Psychology, 56,

610-616. doi:10.1037/0022-006X.56.4.610

Diener, M. J. (2009). Effect Size Calculator from Raw Data to r. (Version 1.3) [Computer

Software]. Arlington, VA: American School of Professional Psychology, Argosy

University, Washington DC.

Duval, S., & Tweedie, R. (2000a). A nonparametric ‘trim and fill’ method of accounting for

publication bias in meta-analysis. Journal of the American Statistical Association, 95,

89-98. doi:10.2307/2669529

Duval, S., & Tweedie, R. (2000b). Trim and fill: A simple funnel-plot-based method of testing

and adjusting for publication bias in meta-analysis. Biometrics, 56, 455-463. Retrieved

from http://www.biometrics.tibs.org/

Egger, M., Davey Smith, G., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected

by a simple, graphical test. British Medical Journal, 315, 629-634.

doi:10.1136/bmj.315.7109.629



Eubanks-Carter, C., Muran, J. C., & Safran, J. D. (2010). Alliance ruptures and resolution. In J. C.

Muran & J. P. Barber (Eds.), The therapeutic alliance: An evidence-based guide to

practice (pp. 74-96). New York, NY: Guilford Press.

Fertuck, E. A., Bucci, W., Blatt, S. J., & Ford, R. Q. (2004). Verbal representation and

therapeutic change in anaclitic and introjective patients. Psychotherapy: Theory,

Research, Practice, Training, 41, 12-25. doi:10.1037/0033-3204.41.1.13

Field, A. P. (2001). Meta-analysis of correlation coefficients: A Monte Carlo comparison of

fixed- and random-effects methods. Psychological Methods, 6, 161-180.

doi:10.1037/1082-989X.6.2.161

Field, A. P. (2005). Is the meta-analysis of correlation coefficients accurate when population

correlations vary? Psychological Methods, 10, 444-467. doi:10.1037/1082-989X.10.4.444

Fleiss, J. L. (1981). Statistical methods for rates and proportions (2nd ed.). New York, NY:

Wiley.

*Fowler, J. C., Ackerman, S. J., Speanburg, S., Bailey, A., Blagys, M., & Conklin, A. C. (2004).

Personality and symptom change in treatment-refractory inpatients: Evaluation of the

phase model of change using Rorschach, TAT, and DSM-IV Axis V. Journal of

Personality Assessment, 83, 306-322. doi:10.1207/s15327752jpa8303_12

*Fowler, J. C., Brunnschweiler, B., & Brock, J. (2002). Exploring the inner world of severely

disturbed bulimic women: Empirical investigations of psychoanalytic theory of female

development. In R. F. Bornstein & J. M. Masling (Eds.), The psychodynamics of gender

and gender role (pp. 129-163). Washington, DC: American Psychological Association.

Fowler, J. C., & Erdberg, P. (2006). The Mutuality of Autonomy Scale: An implicit measure of


object relations for the Rorschach Inkblot Method. South African Rorschach Journal, 2,

3-10. Retrieved from http://www.ikpp.si/izobrazevanje-in-usposabljanje/publikacije

*Fowler, J. C., Hilsenroth, M. J., & Handler, L. (1996). A multimethod approach to assessing

dependency: The early memory dependency probe. Journal of Personality Assessment,

67, 399-413. doi:10.1207/s15327752jpa6702_13

*Fowler, J. C., Hilsenroth, M. J., & Nolan, E. (2000). Exploring the inner world of self-

mutilating borderline patients: A Rorschach investigation. Bulletin of the Menninger

Clinic, 64, 365-385.

Freud, S. (1957a). On narcissism. In J. Strachey (Ed. and Trans.), The standard edition of the

complete psychological works of Sigmund Freud (Vol. XIV, pp. 67-102). London:

Hogarth. (Original work published 1914).

Freud, S. (1957b). Three essays on the theory of sexuality. In J. Strachey (Ed. and Trans.), The

standard edition of the complete psychological works of Sigmund Freud (Vol. VII, pp.

125-245). London: Hogarth. (Original work published 1905).

*Gerard, S. M., Jobes, D., Cimbolic, P., Ritzler, B. A., & Montana, S. (2003). A Rorschach study

of interpersonal disturbance in priest child molesters. Sexual Addiction & Compulsivity,

10, 53-66. doi:10.1080/10720160390186312

*Goddard, R., & Tuber, B. (1989). Boyhood separation anxiety disorder: Thought disorder and

object relations psychopathology as manifested in Rorschach imagery. Journal of


*Goldberg, E. H. (1989). Severity of depression and developmental levels of psychological

functioning in 8-16-year-old girls. American Journal of Orthopsychiatry, 59, 167-178.

doi:10.1111/j.1939-0025.1989.tb01648.x



Gorney, J. E., & Weinstock, S. (1980). Borderline object relations, therapeutic impasse, and the

Rorschach. In J. Kwawer, H. Lerner, P. Lerner, & A. Sugarman (Eds.), Borderline

phenomena and the Rorschach test (pp. 167-187). New York, NY: International

Universities Press, Inc.

*Greco, C. M., & Cornell, D. G. (1992). Rorschach object relations of adolescents who

committed homicide. Journal of Personality Assessment, 59, 574-583.

doi:10.1207/s15327752jpa5903_11

Greenberg, J. R., & Mitchell, S. A. (1983). Object relations in psychoanalytic theory.

Cambridge, MA: Harvard University Press.

*Harder, D. W., Greenwald, D. F., Wechsler, S., & Ritzler, B. A. (1984). The Urist Rorschach

Mutuality of Autonomy Scale as an indicator of psychopathology. Journal of Clinical

Psychology, 40, 1078-1083. doi:10.1002/1097-4679(198407)40:4<1078::AID-

JCLP2270400438>3.0.CO;2-T

*Hart, B., & Hilton, I. (1988). Dimensions of personality organization as predictors of teenage

pregnancy risk. Journal of Personality Assessment, 52, 116-132.

doi:10.1207/s15327752jpa5201_11

Hedges, L. V., & Vevea, J. L. (1998). Fixed- and random-effects models in meta-analysis.

Psychological Methods, 3, 486-504. doi:10.1037/1082-989X.3.4.486

Hemphill, J. F. (2003). Interpreting the magnitudes of correlation coefficients. American

Psychologist, 58, 78-79. doi:10.1037/0003-066X.58.1.78

*Hibbard, S., Porcerelli, J., Kamoo, R., Schwartz, M., & Abell, S. (2010). Defense and object

relational maturity on Thematic Apperception Test scales indicate levels of personality

organization. Journal of Personality Assessment, 92, 241-253.


doi:10.1080/00223891003670190

Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring

inconsistency in meta-analyses. British Medical Journal, 327, 557-560.

doi:10.1136/bmj.327.7414.557

Hiller, J. B., Rosenthal, R., Bornstein, R. F., Berry, D. T. R., & Brunell-Neuleib, S. (1999). A

comparative meta-analysis of Rorschach and MMPI validity. Psychological Assessment,

11, 278-296. doi:10.1037/1040-3590.11.3.278

Hilsenroth, M. J., & Charnas, J. W. (2007). Training manual for Rorschach interrater reliability

(2nd ed.). Unpublished manuscript, The Derner Institute of Advanced Psychological

Studies, Adelphi University, Garden City, NY.

Hilsenroth, M. J., Handler, L., Toman, K. M., & Padawer, J. R. (1995). Rorschach and MMPI-2

indices of early psychotherapy termination. Journal of Consulting and Clinical Psychology,

63, 956-965. doi:10.1037/0022-006X.63.6.956

Høglend, P., Hersoug, A. G., Bøgwald, K., Amlo, S., Marble, A., Sørbye, Ø., . . . Crits-

Christoph, P. (2011). Effects of transference work in the context of therapeutic alliance and

quality of object relations. Journal of Consulting and Clinical Psychology, 79, 697-706.

doi:10.1037/a0024863

Holt, R. R. (2009). Primary process thinking: Theory, measurement, and research. New York,

NY: Jason Aronson.

Horvath, A. O., & Symonds, B. D. (1991). Relation between working alliance and outcome in

psychotherapy: A meta-analysis. Journal of Counseling Psychology, 38, 139-149.

doi:10.1037/0022-0167.38.2.139

Jaffe, L. S. (1990). The empirical foundations of psychoanalytic approaches to psychological


testing. Journal of Personality Assessment, 55, 746-755.

doi:10.1080/00223891.1990.9674109

Jaffe, L. S. (1992). The impact of theory on psychological testing: How psychoanalytic theory

makes diagnostic testing more enjoyable and rewarding. Journal of Personality

Assessment, 58, 621-630. doi:10.1207/s15327752jpa5803_15

*Kavanagh, G. G. (1985). Changes in patients' object representations during psychoanalysis and

psychoanalytic psychotherapy. Bulletin of the Menninger Clinic, 49, 546-564.

Kernberg, O. F. (1966). Structural derivatives of object relationships. International Journal of

Psycho-Analysis, 47, 236-252.

Kernberg, O. F. (1970). A psychoanalytic classification of character pathology. Journal of the

American Psychoanalytic Association, 18, 800-822.

Kernberg, O. F. (1975). Borderline conditions and pathological narcissism. New York, NY:

Aronson.

Kissen, M. (1986). Assessing object relations phenomena. Madison, CT: International


Kleiger, J. H. (1999). Disordered thinking and the Rorschach. Hillsdale, NJ: The Analytic Press.

Kohut, H. (1966). Forms and transformations of narcissism. Journal of the American

Psychoanalytic Association, 14, 243-272.

Kohut, H. (1971). The analysis of the self. New York, NY: International Universities Press, Inc.

Kwawer, J., Lerner, H., Lerner, P. M., & Sugarman, A. (1980). Borderline phenomena and the

Rorschach test. New York, NY: International Universities Press, Inc.

Leichtman, M. (1996). The Rorschach: A developmental perspective. Hillsdale, NJ: The

Analytic Press.


*Leifer, M., Shapiro, J. P., Martone, M. W., & Kassem, L. (1991). Rorschach assessment of

psychological functioning in sexually abused girls. Journal of Personality Assessment, 56,

14-28. doi:10.1207/s15327752jpa5601_2

Lerner, H. (1986). An object representational approach to Rorschach assessment. In M.

Kissen (Ed.), Assessing object relations phenomena, (pp. 127-142). Madison, CT:

International Universities Press.

Lerner, H., & Lerner, P. M. (1988). Primitive mental states and the Rorschach. New York, NY:

International Universities Press, Inc.

Lerner, P. M. (1991). Psychoanalytic theory and the Rorschach. Hillsdale, NJ: The Analytic

Press.

Lerner, P. M. (1992). Toward an experimental psychoanalytic approach to the Rorschach.

Bulletin of the Menninger Clinic, 56, 451-464.

Lerner, P. M. (1996). Current perspective on psychoanalytic Rorschach assessment. Journal of


Lerner, P. M. (1998). Psychoanalytic perspectives on the Rorschach. Hillsdale, NJ: The Analytic

Press.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage

Publications.

Lukowitsky, M. R., & Pincus, A. L. (2011). The pantheoretical nature of mental representations

and their ability to predict interpersonal adjustment in a nonclinical sample.

Psychoanalytic Psychology, 28, 48-74. doi:10.1037/a0020849

Mahler, M. (1952). On childhood psychoses and schizophrenia: Autistic and symbiotic infantile

psychoses. The Psychoanalytic Study of the Child, 7, 286-305.


Masling, J. (1983). Empirical studies of psychoanalytic theories (Vol. 1). Hillsdale, NJ: The

Analytic Press.

Mayman, M. (1967). Object-representations and object-relationships in Rorschach responses.

Journal of Projective Techniques and Personality Assessment, 31, 17-24.

doi:10.1080/0091651X.1967.10120387

Mayman, M. (1968). Early memories and character structure. Journal of Projective Techniques

and Personality Assessment, 32, 303-316. doi:10.1080/0091651X.1968.10120488

Martin, D., Garske, J., & Davis, M. (2000). Relation of the therapeutic alliance with outcome and

other variables: A meta-analytic review. Journal of Consulting and Clinical Psychology,

68, 438-450. doi:10.1037/0022-006X.68.3.438

*Mazor, A., Alfa, A., & Gampel, Y. (1993). On the thin blue line between connection and

separation: The individuation process, from cognitive and object-relations perspectives,

in kibbutz adolescents. Journal of Youth and Adolescence, 22, 641-669.

doi:10.1007/BF01537136

Meyer, G. J., McGrath, R. E., & Rosenthal, R. (2003). Basic effect size guide with SPSS and

SAS syntax. Retrieved from

http://psychology.utoledo.edu/Images/BasicESGuideTemp9.rtf

Meyer, G. J., Viglione, D. J., Mihura, J. L., Erard, R. E., & Erdberg, P. (2011). Rorschach

Performance Assessment System: Administration, coding, interpretation, and

technical manual. Toledo, OH: Rorschach Performance Assessment System,

LLC.

Mihura, J. L., Meyer, G. J., Dumitrascu, N., & Bombel, G. (2012). The validity of

individual Rorschach variables: Systematic reviews and meta-analyses of the

Comprehensive System. Psychological Bulletin. Advance online publication.

doi:10.1037/a0029406.

*Mihura, J. L., Nathan-Montano, E., & Alperin, R. J. (2003). Rorschach measures of aggressive

drive derivatives: A college student sample. Journal of Personality Assessment, 80, 41-

49. doi:10.1207/S15327752JPA8001_12

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & the PRISMA Group (2009). Preferred

reporting items for systematic reviews and meta-analyses: The PRISMA statement.

British Medical Journal, 339, 332-336. doi:10.1136/bmj.b2535.

*Munczek, D. S., & Tuber, S. (1998). Political repression and its psychological effects on

Honduran children. Social Science & Medicine, 47, 1699-1713. doi:10.1016/S0277-

9536(98)00252-4

*Murray, J. F. (1985). Borderline manifestations in the Rorschachs of male transsexuals. Journal

of Personality Assessment, 49, 454-466. doi:10.1207/s15327752jpa4905_1

National Research Council. (1992). Combining information: Statistical issues and opportunities

for research. Washington, DC: National Academy Press.

Orwin, R. G., & Vevea, J. L. (2009). Evaluating coding decisions. In H. Cooper, L. V. Hedges,

& J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed.;

pp. 177-203). New York, NY: Russell Sage Foundation.

Piper, W. E., Azim, H. F. A., Joyce, A. S., McCallum, M., Nixon, G. W. H., & Segal, P. S.

(1991). Quality of object relations versus interpersonal functioning as predictors of

therapeutic alliance and psychotherapy outcome. Journal of Nervous and Mental

Disease, 179, 432-438. doi:10.1097/00005053-199107000-00008

Rapaport, D., Gill, M. M., & Schafer, R. (1945-1946). Diagnostic psychological testing: The


theory, statistical evaluation, and diagnostic application of a battery of tests (Vol. 1-2).

Chicago, IL: The Year Book Publishers, Inc.

Rapaport, D., Gill, M. M., & Schafer, R. (1968). Diagnostic psychological testing. Madison, CT:

International Universities Press, Inc.

Rorschach, H. (1951). Psychodiagnostics: A diagnostic test based on perception (5th ed.).

Oxford, England: Grune & Stratton. (Original work published 1921).

Rosenthal, R. (1991). Meta-analytic procedures for social research (Rev. ed.). Newbury Park,

CA: Sage.

Rosenthal, R., & DiMatteo, M. R. (2001). Meta-analysis: Recent developments in quantitative

methods for literature reviews. Annual Review of Psychology, 52, 59-82.

doi:10.1146/annurev.psych.52.1.59

Rosenthal, R., Hiller, J. B., Bornstein, R. F., Berry, D. T. R., & Brunell-Neuleib, S. (2001).

Meta-analytic methods, the Rorschach, and the MMPI. Psychological Assessment, 13,

449-451. doi:10.1037/1040-3590.13.4.449

Rosenthal, R., & Rosnow, R. L. (2007). Essentials of behavioral research: Methods and data

analysis (3rd ed.). New York, NY: McGraw-Hill.

Ryan, E. R., & Cicchetti, D. V. (1985). Predicting quality of alliance in the initial psychotherapy

interview. Journal of Nervous and Mental Disease, 173, 717-725.

doi:10.1097/00005053-198512000-00002

*Ryan, R. M., Avery, R. R., & Grolnick, W. S. (1985). A Rorschach assessment of children's

mutuality of autonomy. Journal of Personality Assessment, 49, 6-12.

doi:10.1207/s15327752jpa4901_2

Safran, J. D., & Muran, J. C. (2000). Negotiating the therapeutic alliance: A relational


treatment guide. New York, NY: Guilford Press.

*Salyer, K. M., Holmstrom, R. W., & Noshpitz, J. D. (1991). Learning disabilities as a childhood

manifestation of severe psychopathology. American Journal of Orthopsychiatry, 61, 230-

240. doi:10.1037/h0079244

Schachtel, E. G. (1966). Experimental foundations of Rorschach’s test. London, UK: Tavistock.

Schafer, R. (1948). Clinical application of psychological tests. New York, NY: International

Universities Press.


Schafer, R. (1954). Psychoanalytic interpretation in Rorschach testing. New York, NY: Grune &

Stratton.

Schafer, R. (1967). Projective testing and psychoanalysis. New York, NY: International

Universities Press.


Selcuk, E., Zayas, V., Günaydin, G., Hazan, C., & Kross, E. (2012). Mental representations of

attachment figures facilitate recovery following upsetting autobiographical memory

recall. Journal of Personality and Social Psychology. Advance online publication.

doi:10.1037/a0028125.

*Shahar, G., Blatt, S. J., & Ford, R. Q. (2003). Mixed anaclitic-introjective psychopathology in

treatment-resistant inpatients undergoing psychoanalytic psychotherapy. Psychoanalytic

Psychology, 20, 84-102. doi:10.1037/0736-9735.20.1.84

Silverstein, M. L. (1999). Self psychology and diagnostic testing: Identifying selfobject functions

through psychological testing. Mahwah, NJ: Lawrence Erlbaum Associates, Publishers.

*Spear, W. E., & Sugarman, A. (1984). Dimensions of internalized object relations in borderline

and schizophrenic patients. Psychoanalytic Psychology, 1, 113-129.

doi:10.1037/0736-9735.1.2.113


Sterne, J. A. C., & Egger, M. (2001). Funnel plots for detecting bias in meta-analysis: Guidelines

on choice of axis. Journal of Clinical Epidemiology, 54, 1046-1055. Retrieved from

http://www.jclinepi.com/

Sterne, J. A. C., & Harbord, R. M. (2004). Funnel plots in meta-analysis. The Stata Journal, 4,

127-141. Retrieved from http://www.stata-journal.com/

*Strauss, J., & Ryan, R. M. (1987). Autonomy disturbances in subtypes of anorexia nervosa.

Journal of Abnormal Psychology, 96, 254-258. doi:10.1037/0021-843X.96.3.254

Sugarman, A. (1991). Where’s the beef? Putting personality back into personality assessment.

Journal of Personality Assessment, 56, 130-144. doi:10.1207/s15327752jpa5601_12

Sugarman, A., & Kanner, K. (2000). The contribution of psychoanalytic theory to psychological

testing. Psychoanalytic Psychology, 17, 3-23. doi:10.1037/0736-9735.17.1.3

*Tuber, S. B. (1983). Children's Rorschach scores as predictors of later adjustment. Journal of

Consulting and Clinical Psychology, 51, 379-385. doi:10.1037/0022-006X.51.3.379

*Tuber, S. B., & Coates, S. (1989). Indices of psychopathology in the Rorschachs of boys with

severe gender identity disorder: A comparison with normal control subjects. Journal of


*Tuber, S. B., Frank, M. A., & Santostefano, S. (1989). Children's anticipation of impending

surgery: Shifts in object-representational paradigms. Bulletin of the Menninger Clinic, 53,

501-511.

*Urist, J. (1977). The Rorschach test and the assessment of object relations. Journal of


*Urist, J., & Shill, M. (1982). Validity of the Rorschach Mutuality of Autonomy Scale: A

replication using excerpted responses. Journal of Personality Assessment, 46, 450-

454. doi:10.1207/s15327752jpa4605_1

Weiner, I. B. (1966). Psychodiagnosis in schizophrenia. New York, NY: John Wiley & Sons,

Inc.

Weiner, I. B. (1991). Editor’s note: Interscorer agreement in Rorschach research. Journal of

Personality Assessment, 56, 1. doi:10.1207/s15327752jpa5601_1

*Zodan, J., Charnas, J., & Hilsenroth, M. J. (2009). Rorschach assessment of reality testing,

affect and object representation of borderline pathology: A comparison of clinical

samples. Bulletin of the Menninger Clinic, 73, 121-142.


Table 1

Overall Random Effects Meta-Analysis of the Mutuality of Autonomy Scale Criterion Validity

                                  Overall  Overall      95% confidence interval
Study Name                        N        Effect Size  Lower Limit  Upper Limit  Z-value  p-value

Ackerman et al. (2000);
  Baity et al. (2009);
  Blais et al. (1999);
  Blais et al. (2001)a            78       .20          -.03         .41          1.74     .08
Blatt et al. (1988);
  Blatt et al. (1990);
  Blatt & Ford (1994);
  Cook et al. (1995);
  Shahar et al. (2003)a           90       .13          -.08         .32          1.23     .22
Brown-Cheatham (1993)             40       .38          .08          .62          2.44     .02
Fowler et al. (1996)              65       .16          -.09         .39          1.27     .20
Fowler et al. (2000);
  Fowler et al. (2002);
  Fowler et al. (2004)a           79       .24          .02          .44          2.16     .03
Gerard et al. (2003)              125      .13          -.05         .30          1.42     .16
Goddard & Tuber (1989);
  Tuber & Coates (1989)a          35       .22          -.13         .52          1.24     .21
Goldberg (1989)                   100      .42          .25          .57          4.44     <.001
Greco & Cornell (1992)            97       -.15         -.34         .05          -1.47    .14
Harder et al. (1984)              59       .32          .07          .53          2.48     .01
Hart & Hilton (1988)              119      .11          -.09         .30          1.10     .27
Hibbard et al. (2010)             155      .23          .08          .37          2.89     <.005
Kavanagh (1985);
  Blatt & Shahar (2004)a          33       .02          -.32         .36          0.12     .91
Leifer et al. (1991)              64       .28          .04          .49          2.24     .03
Mazor et al. (1993)               60       .16          -.10         .40          1.22     .22
Mihura et al. (2003)              70       .22          -.01         .43          1.84     .07
Munczek & Tuber (1998)            27       .35          -.04         .64          1.79     .07
Murray (1985)                     50       .38          .11          .60          2.74     .01
Ryan et al. (1985)                60       .18          -.08         .41          1.33     .18
Salyer et al. (1991)              47       .33          .05          .56          2.27     .02
Spear & Sugarman (1984)           54       .28          .01          .51          2.01     .04
Strauss & Ryan (1987)             46       .34          .06          .58          2.34     .02
Tuber (1983)                      70       .24          .01          .45          2.01     .05
Tuber et al. (1989)               28       .45          .09          .71          2.43     .02
Urist (1977)                      40       .50          .22          .70          3.33     <.005
Urist & Shill (1982)              60       .43          .20          .62          3.51     <.001
Zodan et al. (2009)               51       .26          -.01         .50          1.87     .06

Overall Weighted Mean r           1,803    .24          .18          .29          8.16     <.001

aIn order to maintain the statistical assumption of independence, data from these studies, which contained identical or overlapping samples, were aggregated.
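The interval and Z-value columns of Table 1 are consistent with the usual Fisher r-to-z approach, which can be sketched as follows. This is a minimal illustration, not the software used for the published analysis: the function names, the `TABLE_1` list, the standard error 1/sqrt(N - 3), and the DerSimonian-Laird estimator of between-study variance are all assumptions introduced here. Because the published values were derived from effect-size-level data, the pooled estimate from this sketch need not match Table 1 exactly.

```python
import math

# (r, N) pairs transcribed from Table 1, one per independent sample.
TABLE_1 = [
    (.20, 78), (.13, 90), (.38, 40), (.16, 65), (.24, 79), (.13, 125),
    (.22, 35), (.42, 100), (-.15, 97), (.32, 59), (.11, 119), (.23, 155),
    (.02, 33), (.28, 64), (.16, 60), (.22, 70), (.35, 27), (.38, 50),
    (.18, 60), (.33, 47), (.28, 54), (.34, 46), (.24, 70), (.45, 28),
    (.50, 40), (.43, 60), (.26, 51),
]


def fisher_ci(r, n, z_crit=1.96):
    """Return (lower, upper, z_value) for one correlation r based on n subjects."""
    z = math.atanh(r)                 # Fisher r-to-z transform
    se = 1.0 / math.sqrt(n - 3)       # assumed standard error of Fisher z
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper, z / se


def random_effects_mean(studies, z_crit=1.96):
    """DerSimonian-Laird random-effects mean of correlations.

    studies: iterable of (r, n) pairs.
    Returns (mean_r, lower, upper, Q), with Q the heterogeneity statistic.
    """
    zs = [math.atanh(r) for r, _ in studies]
    variances = [1.0 / (n - 3) for _, n in studies]
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(zs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                         # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_star, zs)) / sum(w_star)
    se_re = math.sqrt(1.0 / sum(w_star))
    return (math.tanh(z_re),
            math.tanh(z_re - z_crit * se_re),
            math.tanh(z_re + z_crit * se_re),
            q)


if __name__ == "__main__":
    print(fisher_ci(.50, 40))          # Urist (1977) row of Table 1
    print(random_effects_mean(TABLE_1))
```

Working on the Fisher z scale and converting back with `tanh` keeps the confidence limits inside the range -1 to 1, which is why the intervals in Table 1 are not symmetric around r.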


Table 2

Moderator Codes for Each Independent Sample

Study                     QIRM  DB                ADMIN        S METHOD       PATH     Age    Sample Type  % White  % Male

Ackerman et al. (2000);
  Baity et al. (2009);
  Blais et al. (1999);
  Blais et al. (2001)a    Low   Full Blinding     CS           Urist          Medium   ----   Adult        ----     ----
Blatt et al. (1988);
  Blatt et al. (1990);
  Blatt & Ford (1994);
  Cook et al. (1995);
  Shahar et al. (2003)a   Low   Full Blinding     Standard     Urist          High     ----   Adult        ----     50.00
Brown-Cheatham (1993)     Low   NR or none        Cannot Tell  Urist          Normals  9.20   Adolescent   0.00     100.00
Fowler et al. (1996)      Low   Full Blinding     CS           Urist          Medium   35.00  Adult        94.00    26.15
Fowler et al. (2000);
  Fowler et al. (2002);
  Fowler et al. (2004)a   High  Full Blinding     CS           Urist          High     ----   Adult        ----     ----
Gerard et al. (2003)      Low   NR or none        Cannot Tell  Urist          High     ----   Adult        ----     100.00
Goddard & Tuber (1989);
  Tuber & Coates (1989)a  Low   Full Blinding     Standard     Urist          Medium   ----   Adolescent   ----     100.00
Goldberg (1989)           Low   NR or none        Cannot Tell  Urist          Medium   11.46  Adolescent   7.00     0.00
Greco & Cornell (1992)    High  Full Blinding     Cannot Tell  Urist          Medium   15.90  Adolescent   20.00    90.91
Harder et al. (1984)      Low   Partial Blinding  Nonstandard  Other          High     38.48  Adult        ----     31.67
Hart & Hilton (1988)      Low   Partial Blinding  Standard     Urist          Normals  19.30  Adolescent   ----     0.00
Hibbard et al. (2010)     Low   Partial Blinding  Standard     Urist          Medium   32.29  Adult        64.52    48.39
Kavanagh (1985);
  Blatt & Shahar (2004)a  Low   Full Blinding     Standard     Urist          High     32.00  Adult        ----     51.52
Leifer et al. (1991)      Low   Full Blinding     CS           Urist          Medium   ----   Adolescent   0.00     0.00
Mazor et al. (1993)       Low   NR or none        Cannot Tell  Urist          Normals  15.50  Adolescent   ----     46.67
Mihura et al. (2003)      High  NR or none        CS           Urist          Normals  23.80  Adult        76.00    20.00
Munczek & Tuber (1998)    Low   Full Blinding     Nonstandard  Urist & Shill  Medium   15.04  Adolescent   0.00     55.56
Murray (1985)             Low   Full Blinding     Cannot Tell  Urist          Medium   ----   Adult        83.00    100.00
Ryan et al. (1985)        Low   NR or none        Standard     Urist          Normals  ----   Adult        ----     50.00
Salyer et al. (1991)      Low   Full Blinding     Cannot Tell  Urist          Low      ----   Adolescent   100.00   100.00
Spear & Sugarman (1984)   Low   Full Blinding     Standard     Other          High     ----   Adult        ----     ----
Strauss & Ryan (1987)     High  Full Blinding     Cannot Tell  Urist          Medium   20.80  Adult        100.00   0.00
Tuber (1983)              Low   Full Blinding     Cannot Tell  Urist          High     ----   Adult        ----     ----
Tuber et al. (1989)       High  Full Blinding     Cannot Tell  Urist          Normals  9.00   Adolescent   100.00   100.00
Urist (1977)              Low   NR or none        Cannot Tell  Urist          High     37.00  Adult        ----     45.00
Urist & Shill (1982)      Low   NR or none        Cannot Tell  Urist & Shill  Medium   15.33  Adolescent   ----     50.00
Zodan et al. (2009)       High  Full Blinding     CS           Urist          Medium   28.20  Adult        ----     19.61

Note. QIRM = quality of inter-rater reliability for MOA; DB = degree of blinding of MOA scorer to the criterion variables; NR = not reported; ADMIN = system utilized for Rorschach administration; S METHOD = referenced MOA scoring method; PATH = severity of pathology; AGE = subject mean age or midpoint of age range; ---- = not reported or cannot be determined; CS = Comprehensive System. Moderator codes for the effect size level variables are not presented in this table due to space considerations but are available upon request from the first author. Please see Appendix B for sample and effect size level coding and variable descriptions.

aData from these studies were aggregated due to identical or overlapping samples. If the studies varied in ratings for a particular moderator variable, the lower rating was assigned to be conservative.
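Two coding rules behind Table 2 can be illustrated with a short sketch: the 0-2 QIRM code defined in Appendix B, and the conservative rule in the table note for aggregated samples. The function names and arguments below are hypothetical, and treating "the lower rating" as the minimum of the numeric codes is an assumption that only makes sense for ordinal moderators such as QIRM and DB.

```python
def qirm_code(reported, coefficient=None, appropriate=True):
    """Code the quality of MOA inter-rater reliability on the 0-2 scale of Appendix B.

    reported:    True if the study states that reliability was assessed.
    coefficient: mean kappa or ICC, or None if no usable coefficient was reported.
    appropriate: False if the reliability method was judged inappropriate.
    """
    if not reported:
        return 0                      # not reported or not assessed
    if coefficient is None or not appropriate or coefficient < 0.60:
        return 1                      # inappropriate method, no data, or low agreement
    return 2                          # kappa or ICC indicating high agreement


def conservative_rating(codes):
    """Return the lowest code among publications that share one sample.

    Assumption: higher numeric codes mean better methodological quality,
    so the minimum is the conservative choice.
    """
    if not codes:
        raise ValueError("at least one code is required")
    return min(codes)


if __name__ == "__main__":
    # Hypothetical example: three overlapping publications coded 2, 1, and 2.
    print(conservative_rating([qirm_code(True, 0.85),
                               qirm_code(True, 0.55),
                               qirm_code(True, 0.72)]))
```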


Table 3

Inter-rater reliability results for the Mutuality of Autonomy Scale meta-analysis

Variable      % Agreement  Kappa   ICC (1, 1)a  ICC (2, 1)b  ICC (3, 1)c

QIRM          ----         ----    .82          .82          .82
STDL SMPLSZ   ----         ----    .97          .97          .97
S METHOD      80.00        ----    ----         ----         ----
PUBTYPE       100.00       ----    ----         ----         ----
DB            ----         ----    .75          .75          .75
ADMIN         ----         .30d    ----         ----         ----
PATH          ----         ----    .71          .71          .71
AGE           ----         ----    .80          .80          .80
SAMPLE        ----         1.00    ----         ----         ----
RACE          ----         ----    .83          .83          .83
GENDER        ----         ----    1.00         1.00         1.00
ABS ESe       ----         ----    1.00         1.00         1.00
ESL SMPLSZe   ----         ----    1.00         1.00         1.00
SIGNe         100.00       ----    ----         ----         ----
CVSe          84.00        ----    ----         ----         ----
MOA SCORe     94.00        ----    ----         ----         ----

Note. QIRM = quality of inter-rater reliability codings for MOA; STDL SMPLSZ = study level sample size; S METHOD = referenced MOA scoring method; PUBTYPE = publication type; DB = degree of blinding of MOA scorer to the criterion variable(s); ADMIN = Rorschach system method utilized for administration; PATH = severity of pathology; AGE = subject mean age or midpoint of reported age range; SAMPLE = sample type; RACE = percent white; GENDER = percent male; ABS ES = absolute value of the effect size; ESL SMPLSZ = effect size level sample size; SIGN = sign of the effect size (i.e., “1” if the effect is in the predicted direction of the meta-analysis and “-1” if it is in the opposite direction); CVS = criterion variable source; MOA SCOR = specific MOA scaled score utilized; ---- = not applicable.

aICC (1, 1) = Model 1 (one-way random effects) of intraclass correlation coefficient.

bICC (2, 1) = Model 2 (two-way random effects) of intraclass correlation coefficient.

cICC (3, 1) = Model 3 (two-way mixed effects) of intraclass correlation coefficient.

dSeveral studies utilized archival data samples and re-scored the Rorschach with the Comprehensive System. This caused some initial discrepancy between raters regarding the difference between the Rorschach system utilized for administration and the Rorschach system utilized for post-administration scoring. Discussion between raters for consensus coding clarified this issue.

eThis variable was coded at the effect size level rather than the study level; as a result, inter-rater reliability analyses for this variable were conducted utilizing only those effect sizes (N = 59) that both raters agreed were present. The inter-rater reliability for the presence/absence of effect sizes was 71.08% (i.e., 59/83). For all remaining variables in this table, N = 27 for the inter-rater reliability analyses since these variables were coded at the study level.
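For the categorical variables in Table 3, percent agreement and Cohen's kappa can be computed along the following lines. This is a minimal sketch rather than the procedure actually used: the function names and the example ratings are hypothetical, and only the 59-of-83 agreement count comes from the table note.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters assigned the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per code.
    p_expected = sum(counts_a[c] * counts_b[c]
                     for c in set(counts_a) | set(counts_b)) / (n * n)
    if p_expected == 1.0:
        return 1.0                    # both raters used a single identical code
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Presence/absence of 83 candidate effect sizes; the raters agree on 59.
    a = [1] * 83
    b = [1] * 59 + [0] * 24
    print(round(percent_agreement(a, b), 2))
    # Hypothetical ADMIN codes from two raters for four studies.
    print(cohens_kappa([2, 2, 1, 1], [2, 1, 1, 1]))
```

Kappa can be much lower than percent agreement when one code dominates, which is one reason a table of this kind reports the two indices separately.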


Table 4

Subgroup random effects meta-analytic results by specific MOA scale scores

MOA Scale Score  k   r     p       95% CI

MOA Highest      12  .24   <.001   .14-.34
MOA Lowest       10  .21   <.001   .10-.32
MOA Mean         19  .30   <.001   .22-.37
MOA Path         4   .27   <.001   .10-.42
MOA Other        10  .24   <.001   .13-.35

Note. k = number of studies; r = weighted average effect size; 95% CI = 95% confidence interval for the average weighted effect size.


Table 5

Subgroup random effects meta-analytic results by source of criterion variable

Criterion Variable Source                       k   r     p       95% CI

Group differences on the MOA between clinical
  and nonclinical populations                   6   .33   <.001   .21-.43
Diagnostic group differences on the MOA         6   .19   <.001   .07-.29
Differences on the MOA between groupings
  based on behavioral criteria                  8   .24   <.001   .14-.34
Discrete behavioral markers                     2   .37   <.001   .15-.56
Psychotherapy outcome change data               3   .16   <.001   -.01-.32
Level of symptoms or overall functioning        7   .27   <.001   .17-.37
Ratings of mutuality of autonomy                2   .52   <.001   .33-.67
Ratings of intrapsychic functioning             2   .10   <.001   -.10-.30
Ratings of interpersonal functioning            6   .11   <.001   -.01-.22
Ratings of academic performance                 1   .17   <.001   -.13-.44

Note. k = number of studies; r = average weighted effect size; 95% CI = 95% confidence interval for the average weighted effect size.


Figure 1

Flow diagram for the Meta-Analysis of the Rorschach Mutuality of Autonomy Scale Criterion Validity

Abstracts identified through PsycINFO and MEDLINE electronic database searches (n = 767)

Additional records identified through other sources (e.g., manual JPA search, review articles and book chapters; backwards reference checks) (n = 101)

Records screened (n = 868)

Records excluded (n = 608)

Full-text studies assessed for eligibility (n = 260)

Full-text studies excluded (n = 222)

Publications included in the meta-analysis (n = 38)a

Note. Adapted from flow diagram in Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., and the PRISMA Group (2009). Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. British Medical Journal, 339, 332-336. doi:10.1136/bmj.b2535

aDue to several instances of identical or overlapping studies, the number of independent samples utilized in the meta-analysis was n = 27.


Figure 2

Graphical representation of potential publication bias

Note. The white circles represent the independent studies actually included, plotted by the size of the effect (in Fisher’s Z) on the horizontal axis and the standard error on the vertical axis. If publication bias were present, we would expect the largest effects to have the largest standard errors (yielding a void in the lower left quadrant; Borenstein et al., 2009). Studies toward the tip of the triangle have the smallest standard error. The white diamond represents the weighted average effect size of the actual studies included in the meta-analysis. The following iterative procedure was utilized for the trim and fill: studies at the extreme positive side of the graph are removed, the weighted average effect is recalculated, and this process of trimming the plot continues until the distribution of studies is symmetric around the weighted average effect. Next, each removed study is added back in and a mirror image of the study is imputed to correct for the reduction in the variance of effects that results from the trimming procedure. The black diamond represents the weighted average effect size calculated using the studies actually included in the meta-analysis as well as the imputed studies. In the present meta-analysis, nine studies needed to be trimmed; therefore, the observed and imputed overall effect sizes are not identical.
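The fill step described in the note can be sketched as follows. This is a simplified illustration and not the trim-and-fill routine used for the published analysis: the number of trimmed studies `k` is supplied by the caller rather than estimated iteratively, the weights are fixed-effect inverse-variance weights, and all function names and the example values are hypothetical.

```python
import math


def to_point(r, n):
    """Funnel-plot coordinates for one study: (Fisher z, standard error)."""
    return math.atanh(r), 1.0 / math.sqrt(n - 3)


def weighted_mean(points):
    """Inverse-variance weighted mean of the Fisher z values."""
    weights = [1.0 / se ** 2 for _, se in points]
    return sum(w * z for w, (z, _) in zip(weights, points)) / sum(weights)


def fill_mirror_images(points, k):
    """Impute mirror images of the k most extreme positive studies.

    Estimating k is the iterative part of trim and fill and is NOT
    implemented here; k is taken as given.
    Returns (filled_points, center), where center is the weighted mean
    of the studies that remain after trimming.
    """
    if not 0 <= k < len(points):
        raise ValueError("k must be between 0 and len(points) - 1")
    ordered = sorted(points, key=lambda p: p[0])
    kept = ordered[:len(ordered) - k]
    trimmed = ordered[len(ordered) - k:]
    center = weighted_mean(kept)
    # Reflect each trimmed study around the center, keeping its standard error.
    mirrors = [(2.0 * center - z, se) for z, se in trimmed]
    return list(points) + mirrors, center


if __name__ == "__main__":
    # Hypothetical studies, not taken from Table 1.
    studies = [to_point(r, n) for r, n in [(.10, 50), (.20, 50), (.60, 50)]]
    filled, center = fill_mirror_images(studies, k=1)
    print(math.tanh(weighted_mean(studies)), math.tanh(weighted_mean(filled)))
```

Adding the mirrored studies pulls the weighted mean toward the trimmed center, which corresponds to the shift from the white diamond to the black diamond described in the note.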


Appendix A

Rorschach Mutuality of Autonomy (MOA) Scale7

The Mutuality of Autonomy scale for the Rorschach, developed by Urist (1977), is based on a developmental model that defines various levels or stages of relatedness based on a sense of individual autonomy and the capacity to establish mutuality. Rorschach responses are scored on this 7-point scale if a relationship is stated or clearly implied between animate (people or animals) or inanimate objects. A response is scored even if there is only one animate or inanimate object, provided that a relationship is clearly implied. Thus, an object that is a consequence of an action (a flag torn in half, a moth shot by a shotgun, or a squashed cat) or has the potential for an action on another object (a nuclear explosion) is scored in this analysis of Rorschach responses. Urist (1977) defines 7 scale points for the quality of relations between objects as follows:

Scale Point 1: Figures are engaged in some relationship or activity where they are together and involved with each other in such a way that conveys a reciprocal acknowledgment of their respective individuality. The image contains explicit or implicit reference to the fact that the figures are separate and autonomous and involved with each other in a way that recognizes or expresses a sense of mutuality in the relationship (e.g., “two bears toasting each other, clinking glasses;” “two people having a heated political argument”). At this level, the unique contributions of each individual object to the mutual interaction need to be emphasized. Thus, “two people dancing” would receive a 2, because there is no stated emphasis on the mutuality of their endeavor. To receive a score of 1, a response must have a special emphasis on the mutual but separate nature of a dyadic interaction. Each object must maintain its unique identity and contribution to a relationship in which both objects are mutually engaged. For example, “Two people doing a synchronized dance, like in a ritual ceremony for a wedding” would be scored a 1. This response indicates that the two people are well differentiated and that each needs to be aware of the other’s placement and activity in relation to their own.

Scale Point 2: Figures are engaged together in some relationship or parallel activity, but there is no stated emphasis or highlighting of mutuality, nor on the other hand is there any sense that this dimension is compromised in any way within the relationship. Despite the lack of direct emphasis on mutuality, the response still conveys the potential for mutuality in the relationship (e.g., “two women doing their laundry”). A response is scored 2 when the integrity of the objects is maintained and there is a potential or an implicit capacity for mutuality, independent of the degree of logic, irrationality, or absurdity of the relationship. Responses such as “Two people eating” or “Animals climbing a tree” convey a sense of autonomy, but without the indication of an explicit recognition of the other’s independence. Scale scores 1 and 2 are both similar to Cooperative movement responses found in the Comprehensive System; however, inanimate movement is also scored in the Mutuality of Autonomy scale. Finally, it is important to note that two objects simply fighting are scored a 2. Only if one figure has an unequal, controlling, or imbalanced advantage over the other is such a response coded a higher score.

Scale Point 3: Figures are dependent on each other but without an internal sense of capacity to sustain themselves; leaning or hanging on one another. The objects do not “stand on their own two feet;” rather, they each require some degree of external support or direction. The objects lack a sense of being firmly self-supporting (e.g., “two penguins leaning against a telephone pole”). Scale point 3 reflects dependent relationships in which one or both objects are reliant on the other for stability. Responses such as “A friendly animal up here reaching down helping these bears up the side of a mountain” or “Two baby birds being fed by the mother bird” clearly indicate that the objects do not function independently without external support.

Scale Point 4: One figure is seen as the reflection, imprint, or symmetrical image of another. The relationship between objects conveys a sense that the definition or stability of an object exists only insofar as it is an extension or reflection of another. Shadows, footprints, and so on would be included here, as well as responses of Siamese twins or two animals joined together. Scale point 4 captures the prototypic mirroring object relationship and often reveals an emerging loss of autonomy between figures where one object is seen as a reflection, an imprint, or a mimetic of the other. Responses such as “Siamese twins because they are connected at the waist,” “a wolverine looking at its reflection in the water,” or “A butler staring in the mirror and that’s his reflection” imply that a relationship between objects exists only insofar as one is seen as a reflection or an extension of the other. Other examples include “a smeared fingerprint” and “a shadow cast by a figure walking by.” Any Reflection response found in the Comprehensive System would be scored a 4, or perhaps greater if the content was decidedly violent and destructive.

Scale Point 5: The nature of the relationship between figures is characterized by malevolent control of one figure by another. Themes of influencing, controlling, or casting spells may be present. One figure, either literally or figuratively, may be in the clutches of another. Such themes portray a severe imbalance in the mutuality of relations between figures. On the one hand, some figures seem powerless and helpless, while at the same time, others seem controlling and omnipotent. Themes of violation of an object's integrity through domination, malevolence, and a sense of one object being controlled or forcibly influenced by another are often present in these types of responses (e.g., puppets on a string, witches casting a spell on someone).

Scale Point 6: There is a severe imbalance in the mutuality of relations between figures in decidedly destructive terms; physical damage to the object is present (e.g., a door that has just been kicked in, a flag torn in half, a moth shot by a shotgun, a squashed cat, or a bat impaled by a tree). Two figures more than simply fighting (such as a figure being tortured by another, or an object being strangled by another) are considered to reflect a serious attack on the autonomy of the object. Literal physical damage is seen as having occurred. Similarly, included here are relationships portrayed as parasitic, where a gain by one figure results by definition in the diminution or destruction of another (e.g., a leech sucking up this man's blood, two people feasting after killing this animal, a compression hammer splitting through rock). Many, but not all, Morbid content responses found in the Comprehensive System would be scored a 6 or 7.

Scale Point 7: Relationships are characterized by an overpowering enveloping force. Figures are seen as swallowed up, devoured, or generally overwhelmed by forces completely beyond their control. Forces are described as overpowering, malevolent, perhaps even psychotic. Frequently, the force is described as existing outside of the relationship between two figures or objects, underscoring the massiveness of the force, its overwhelming nature, and the complete passivity and helplessness of the objects or figures involved (e.g., something being consumed by fire, destruction from some cataclysmic disaster (natural or manmade), or God's wrath). Scale point 7 reflects the complete loss of autonomy of one or more figures to an overpowering, diffuse, and enveloping force (e.g., a tornado, volcano, or nuclear explosion hurtling its debris everywhere). Here the loss of autonomy results in more than just the death or physical damage of the object (as in Scale point 6) but rather its annihilation, such as that found in the following response: “An evil fog enveloping this frog. The poison is dissolving it.”

Calculating and Summarizing MOA Data

Each response may only receive one MOA score. When there is the potential for two possible scores to be assigned (e.g., Two Siamese twins doing an intricate waltz), the higher (more maladaptive) score is always given (e.g., 4 over a 1 in the example above).

MOA-R: The number of responses where a MOA score occurs in the protocol (e.g., 1, 2, 2, 4, 5, 6 = 6).

MOA-Sum: The raw sum of all MOA scores found in a protocol (e.g., 1+2+2+4+5+6 = 20).

MOA-Mean: MOA-Sum divided by MOA-R (e.g., 20/6 = 3.33).

MOA-Low: The MOA score representing the single lowest (most adaptive) score found in the protocol (e.g., 1).

MOA-High: The MOA score representing the single highest (least adaptive) score found in the protocol (e.g., 6).

MOA-PATH: The number of responses scored at Scale points 5, 6, or 7 on a given protocol (e.g., 5, 6 = 2).

7Excerpted with permission from pp. 10-12 in Hilsenroth, M. J., & Charnas, J. W. (2007). Training manual for Rorschach interrater reliability (2nd ed.). Unpublished manuscript, The Derner Institute of Advanced Psychological Studies, Adelphi University, Garden City, NY. This training manual is freely available for download at http://www.ror-scan.com/RorschachTrainingManual2ndEd.pdf
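The summary scores defined above can be computed from a list of per-response MOA scores as in the following sketch. The function names are hypothetical and the input is assumed to be one scale point (1 to 7) per scored response. For the example protocol 1, 2, 2, 4, 5, 6, the raw sum is 20 and the mean is 20/6, or about 3.33.

```python
def resolve_response(candidate_scores):
    """Pick the single MOA score for a response that could receive several.

    The higher (more maladaptive) score is always assigned.
    """
    return max(candidate_scores)


def moa_summary(scores):
    """Summarize the MOA scores of one protocol (one score per scored response)."""
    if not scores:
        raise ValueError("protocol contains no MOA-scorable responses")
    if any(s not in range(1, 8) for s in scores):
        raise ValueError("MOA scores must be scale points 1 through 7")
    total = sum(scores)
    return {
        "MOA-R": len(scores),                            # number of scored responses
        "MOA-Sum": total,                                # raw sum of all scores
        "MOA-Mean": total / len(scores),                 # MOA-Sum divided by MOA-R
        "MOA-Low": min(scores),                          # most adaptive score
        "MOA-High": max(scores),                         # least adaptive score
        "MOA-PATH": sum(1 for s in scores if s >= 5),    # count of 5s, 6s, and 7s
    }


if __name__ == "__main__":
    print(moa_summary([1, 2, 2, 4, 5, 6]))
    print(resolve_response([1, 4]))    # "Siamese twins doing an intricate waltz"
```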


Appendix B

Coding Criteria for Moderator Analyses

Sample Level Variables ___ Quality of Inter-rater Reliability for MOA [QIRM]

0 = Not reported or no assessment of inter-rater reliability 1 = Inappropriate method; statements that reliability checks were made but without report of data; calculations not based on data from reported subjects; low agreement (mean Kappa or ICC <.60) ICC = ___ K= ___ Percent Agreement = ___ 2 = Kappa or ICC indicating high agreement (mean over >.59) ICC = ___ Kappa = ___

___ Number of Subjects [N]

___ Referenced MOA Scoring Method [S_METHOD]

1 = Urist (1977)

2 = Urist & Schill (1982)

3 = Holiday & Sparks (2002)

4 = Other: Spear & Sugarman (1984); Harder et al. (1984)

9 = Cannot tell

___ Degree of Blinding of MOA Scorer to the Criterion Variable(s) [DB]

0 = Not Reported or No Blinding

1 = Partial Blinding: Ambiguous report of blinding procedure, or a degree of blinding was applied, e.g., the same person collected all protocols and then scored them randomly

2 = Full Blinding: Scorer(s) completely blinded to criterion variable(s)

___ Administration of Rorschach System [ADMIN]

0 = Nonstandard System: Aronow; fewer than 10 cards

1 = Standard System: Rapaport, Gill, & Schafer; Klopfer; Beck; Holt; H. Rorschach; Allison, Blatt, & Zimet

2 = Comprehensive System

9 = Cannot tell

___ Severity of Pathology [PATH]

0 = College samples; normal subjects

1 = Low: Various groups of participants designated as “patients” without psychotic symptoms or designated as “neurotics” or “depressives”

2 = Medium: Designated as “outpatients;” or mixed groups of neurosis, psychosis, and/or characterological disorders; or a mixture of inpatients and outpatients; default category if not reported

3 = High: Inpatient populations; subjects with schizophrenic spectrum disorders; patients designated as “Psychotic”


__ __.__ __ Subject Age (Mean age as reported or midpoint of reported age range) [AGE; if not reported, code “999”]
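The age coding rule can be sketched in a few lines of Python. The function name `code_age` and its parameters are hypothetical. The sketch assumes that a reported mean takes precedence over the midpoint of a reported range when both are available, and that values are recorded to two decimal places to match the coding field.

```python
from typing import Optional, Tuple

AGE_NOT_REPORTED = 999  # missing-data code specified for AGE


def code_age(
    mean_age: Optional[float] = None,
    age_range: Optional[Tuple[float, float]] = None,
) -> float:
    """Return the AGE value for one sample."""
    if mean_age is not None:
        # Mean age as reported (assumed to take precedence over a range).
        return round(mean_age, 2)
    if age_range is not None:
        # Midpoint of the reported age range.
        low, high = age_range
        return round((low + high) / 2, 2)
    return AGE_NOT_REPORTED
```

For example, a study reporting only an age range of 18 to 65 would be coded 41.5, and a study reporting no age information would be coded 999.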

___ Sample Type [SAMPLE]

1 = Adult

2 = Adolescent

3 = Child

4 = Geriatric

9 = Cannot tell

__ __.__ __ % White [RACE; code “999” if cannot tell]

__ __.__ __ % Male [GENDER; code “999” if cannot tell]

Effect Size Level Variables

___ Criterion Variable Source [CVS]

1 = Group differences on the MOA between clinical and non-clinical populations

2 = Diagnostic group differences (e.g., DSM-III/IV/IV-TR; differences in Blatt’s classification system; differences in Kernberg’s level of personality organization)

3 = Differences on the MOA between behavioral criteria groupings (e.g., self-mutilating vs. non-self-mutilating patients)

4 = Discrete behavioral markers (e.g., number of psychotherapy sessions attended; MOA changes as participants get closer in time or get farther away in time from surgery)

5 = Psychotherapy outcome change data

6 = Level of symptomatology or level of overall functioning (e.g., number of BPD criteria met; GAF, GARF, and SOFAS scores)

7 = Ratings of participant’s level of mutuality of autonomy based on (a) observed actual behaviors, (b) chart review (chart review cannot include access to Rorschach data), or (c) autobiographical writings

8 = Ratings of intrapsychic functioning (e.g., level of motivation for treatment, quality of object relations, sublimatory capacity, superego integration, etc.)

9 = Ratings of interpersonal functioning

10 = Ratings of academic performance (e.g., reflected in school grades which tap into social and motivational processes; not applicable to standardized achievement and intellectual testing which do not meet eligibility criteria for inclusion)

___ MOA Scale Score Utilized [MOASCORE]

1 = MOA Mean

2 = MOA Highest

3 = MOA Lowest

4 = MOA Path

5 = MOA Other: ___

9 = Cannot tell
