Psychological testing: overview notes (psyknoter)
STATISTICAL CONCEPTS
Measures of central tendency
MODE: the most frequently occurring score • ex (1,2,2,5,6,7,9) mode = 2
MEAN: the average score • ex (1,2,2,5,6,7,9) mean = 4,57
MEDIAN: the middle score (50th percentile) • ex (1,2,2,5,6,7,9) median = 5
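The three measures can be checked with a short sketch using Python's standard library. The variable names are illustrative and not part of the notes; the scores are the example set used above.

```python
from statistics import mean, median, mode

# Example score set from the notes
scores = [1, 2, 2, 5, 6, 7, 9]

central_tendency = {
    "mode": mode(scores),            # most frequently occurring score
    "mean": round(mean(scores), 2),  # arithmetic average, rounded to 2 decimals
    "median": median(scores),        # middle score of the sorted list
}
print(central_tendency)
```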
Norms and standardisation
Norms are data that reflect test results from the general population or from a specific group (e.g. by gender or age). They make it possible to interpret a person's test result relative to others, i.e. they show where a person is placed in relation to others. The quality of the norms depends on the sample size, how representative the sample is (including at the extremes), where the norms come from (e.g. which country) and how old they are. Different kinds of norms:
• Percentiles, based on an ordinal scale, ex a person scoring at the 75th percentile has a higher score than 75 % of the people in the sample
• Means and standard deviations
• Normalised scores, ex T-scores
Standardisation includes:
• Standard methods for administering the test
• Standard methods for scoring the test
• The development of norms for the test
Raw scores
The first result from a psychological test, i.e. before the score is converted to e.g. T-scores.
Normalized scores (standard scores)
Normalised scores are raw scores that have been converted to fit a normal distribution and that can serve as a common standard scale across different tests.
z scores
The most basic standard score, which can also be used to calculate other standard scores.
• Mean = 0 • SD = 1
T scores (ex SCL-90, MMPI and NEO-PI)
• Mean = 50 • SD = 10 • Usually ranges from 20-80
Sten scores
• Mean = 5,5 • SD = 2 • Ranges from 1-10
Stanine
• Mean = 5 • SD = 2 • Ranges from 1-9
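As a rough illustration of how the standard scores relate to each other, the sketch below converts a raw score to z and then rescales z linearly to T, sten and stanine using the means and SDs listed above. It assumes a hypothetical raw score of 115 on a test with norm mean 100 and SD 15. Published tests often derive normalised scores from percentile ranks in the norm sample instead, so this linear version is only an approximation, and the function names are made up for the example.

```python
import math

def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the norm mean, in SD units."""
    return (raw - norm_mean) / norm_sd

def rescale(z, scale_mean, scale_sd):
    """Linear conversion of z to another standard scale."""
    return scale_mean + scale_sd * z

def clamp_round(value, lowest, highest):
    """Round half up, then keep the result inside the scale's range."""
    rounded = math.floor(value + 0.5)
    return max(lowest, min(highest, rounded))

# Hypothetical example: raw score 115, norm mean 100, norm SD 15
z = z_score(115, 100, 15)
t = rescale(z, 50, 10)                            # T score: mean 50, SD 10
sten = clamp_round(rescale(z, 5.5, 2), 1, 10)     # sten: mean 5.5, SD 2, range 1-10
stanine = clamp_round(rescale(z, 5, 2), 1, 9)     # stanine: mean 5, SD 2, range 1-9
print(z, t, sten, stanine)
```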
Scales
Nominal
People are classified into categories using numbers. Only a few arithmetic operations are possible. The categories are not ordered in any way and cannot be compared quantitatively.
• Ex male = 1, female = 2.
Ordinal
People are ranked with respect to a particular variable, e.g. from first to last or low to high in a competition. Disadv.: The scale does not show people's absolute position, only their position relative to the others in the set. It is not possible to derive the actual differences between two people, only their relative placement in the rank order.
• Ex the Likert scale (a large number of favourable and unfavourable items are developed and administered to a large group of people, who rate them on a continuum from 1-5, sometimes 1-7, from like to dislike).
Interval
Numbers are assigned to an individual and show whether he or she is >, < or = others, but also represent how large the difference between individuals is. The lack of a true zero point means that the absolute level of what is being measured cannot be known.
• Ex a depression scale from 1-10, Celsius temperatures, IQ etc.
Ratio
Like the interval scale, but with a natural/absolute zero point. The most ideal and meaningful scale.
• Ex metres, kilos, reaction time, number of correct answers etc.
Psychological tests often use ordinal or interval scales.
Reliability
Reliability reflects the degree to which a test gives the same result for the same people each time they take it, i.e. whether the test measures a stable characteristic/trait/factor. Reliability can also be used to assess whether individual differences in test results are caused by real differences between individuals or by random variation, cf. Classic Test Theory, where X (test score) = T (true score) + E (error).
• I.e. the reliability coefficient can be used to calculate the Standard Error of Measurement.
Measures of reliability
• Test-retest (stability of test results over time; correlation)
• Parallel form (correlation)
• Split-half (whether items measure the same trait; correlation with the Spearman-Brown prophecy formula)
• Inter-item (internal reliability and item homogeneity, i.e. whether different items have the same properties; Cronbach's Alpha)
• Scorer/inter-rater (Kappa)
For individual test results, a reliability of > 0.7 is usually required. For research purposes, lower reliabilities can be accepted.
Spearman-Brown Prophecy
A formula that relates a test's length to its reliability and is often used to predict the change in reliability when the test length is changed. (A test's reliability decreases when it has few items.)
• Related to split-half correlations and depends on test length (i.e. number of items).
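The notes do not spell the formula out. Its standard form is r_new = k·r / (1 + (k − 1)·r), where r is the current reliability and k is the factor by which the test length changes. The sketch below applies it to a hypothetical split-half correlation of 0.70; the function name and the example value are assumptions for illustration.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical split-half correlation: each half correlates 0.70 with the other.
half_test_r = 0.70

# Stepping up to the full test (twice the length of one half)
full_test_r = spearman_brown(half_test_r, 2)

# Shortening the test to half its length lowers the predicted reliability
shortened_r = spearman_brown(half_test_r, 0.5)

print(round(full_test_r, 3), round(shortened_r, 3))
```

Doubling the length raises the predicted reliability and halving it lowers it, which matches the remark above that short tests are less reliable.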
Cronbach's Alpha (internal reliability)
• Measures internal reliability, i.e. the degree to which the different items in a test or a particular scale measure the same trait/ability/factor/construct.
• The average of all possible inter-item reliabilities, i.e. of all the ways in which the items in a test can be compared with each other.
• Cronbach's Alpha increases as the number of items in a test increases (as does split-half reliability).
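A minimal sketch of how alpha is commonly computed, using the usual variance form alpha = k/(k − 1) · (1 − sum of item variances / variance of total scores). The response matrix is invented for illustration (5 people, 3 items) and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per person; all rows have the same length."""
    k = len(rows[0])
    # Sample variance of each item across people
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each person's total score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Invented ratings: 5 people x 3 items
ratings = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

When every item ranks people in the same way, the variance of the total score is large relative to the sum of the item variances and alpha approaches 1.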
Validity
Validity concerns whether a test measures what it claims to measure, but also how, when, under which circumstances and for which purposes a test can give meaningful results.
Types of validity:
a. Content validity: Does the test cover all aspects of the area/trait/factor it addresses? Are there irrelevant items? Content validity is a property of the test itself and of how well it has been constructed. It is established by e.g. experts or test users judging that the test covers the area it measures well enough.
b. Criterion-related validity: Whether test results can be shown to be related to an external criterion, e.g. performance on another test, job competence, level of dementia etc. Useful for evaluating whether a test can be used to make predictions. It is usually quantitative, i.e. expressed as a number: correlations and regression when the criterion is continuous (level of dementia), and sensitivity/specificity when a cut-off score is used and the criterion is either-or (schizophrenic or not).
• Predictive validity: Used to predict future behaviour (e.g. results on another test). Of great importance in occupational settings, where test results are used to predict future job performance.
• Concurrent validity: Compares a test with other existing assessments, e.g. results on an occupational test are compared with recent evaluations of job performance. Depends heavily on the quality of the assessments the test is compared with.
c. Construct validity: The most important type of validity, which encompasses and overlaps with the other forms. Concerns the evidence that the test really measures the area/trait/factor it claims to. It is built up through evidence gathered over a longer period and by trying the test out in different fields (unlike predictive validity, which can be established in a single experiment).
o Convergent-discriminant/divergent validity: When a test's results converge/correlate with other tests that measure the same area/trait/factor and diverge from tests that measure something entirely different. Often assessed with the multitrait-multimethod approach.
Other validities
• Face validity – does the test look appropriate to the person being tested.
• Faith validity – does the tester believe in the test. • Ecological validity -‐
The Multitrait-multimethod Approach
A detailed way of establishing convergent-discriminant validity by correlating different assessment methods of different areas/traits/factors in a matrix. Evidence for convergent validity is obtained by observing whether consistent measurements emerge when one area/trait/factor is measured with different methods/test instruments.
Sensitivity and specificity
• Sensitivity – how often the test identifies the patient type (e.g. demented). Low sensitivity leads to more false negatives.
• Specificity – how often the test identifies the non-patient (e.g. not demented). Low specificity leads to more false positives.
Cut-off score, sensitivity and specificity
Sensitivity can be increased at the expense of specificity by adjusting the cut-off score, and vice versa.
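The trade-off can be sketched with invented scores on a hypothetical symptom scale where a score at or above the cut-off counts as a positive test result. The data, names and cut-off values below are assumptions made for illustration only.

```python
# Invented scores on a hypothetical symptom scale (higher = more symptoms)
patient_scores = [12, 15, 18, 22, 25]      # people who truly have the condition
non_patient_scores = [8, 10, 14, 16, 20]   # people who truly do not

def sensitivity_specificity(cutoff):
    """A score >= cutoff is treated as a positive test result."""
    true_positives = sum(score >= cutoff for score in patient_scores)
    true_negatives = sum(score < cutoff for score in non_patient_scores)
    sensitivity = true_positives / len(patient_scores)
    specificity = true_negatives / len(non_patient_scores)
    return sensitivity, specificity

low_cutoff = sensitivity_specificity(15)   # lenient cut-off
high_cutoff = sensitivity_specificity(20)  # strict cut-off
print(low_cutoff, high_cutoff)
```

With these invented data the lenient cut-off catches more patients but also flags more non-patients, while the strict cut-off does the opposite.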
Standard Error of Measurement
Can be used when the reliability is known: SEm = SD x √(1-r)
• It indicates how much ’inaccuracy’ there is in a test score because of its less-than-perfect reliability.
Classic test theory defines reliability as the proportion of variance which is not caused by random error, given that Observed score = True score +/- Measurement error.
Ex: The Wechsler Intelligence Scale, SD = 15, r = 0,9
SEm = 15 x √(1-0,9) = 4,74
Which means that there is about a 68 % probability (1 SEm on both sides of the score) that a person’s ”true score” lies within an interval of approx. +/- 5 points around the person’s obtained score, and about a 95 % probability (2 SEm on both sides of the score) that the true score lies within approx. +/- 10 points (more precisely 1,96 x 4,74 ≈ 9,3) of the obtained score. (= Confidence limits/intervals)
Confidence interval
Based on the SEm: obtained score +/- 1 SEm gives an approx. 68 % interval and obtained score +/- 2 SEm an approx. 95 % interval, as in the Wechsler example above.
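The Wechsler example can be recomputed with a few lines of Python. This is a sketch of the formula SEm = SD x √(1-r) as given in the notes; the function names and the obtained score of 100 are illustrative assumptions.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEm = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

def confidence_interval(obtained_score, sd, reliability, z=1.96):
    """Interval around the obtained score; z = 1.96 gives roughly 95 %."""
    margin = z * standard_error_of_measurement(sd, reliability)
    return obtained_score - margin, obtained_score + margin

# Wechsler example from the notes: SD = 15, r = 0.9
sem = standard_error_of_measurement(15, 0.9)

# Hypothetical obtained score of 100
low, high = confidence_interval(100, 15, 0.9)
print(round(sem, 2), round(low, 1), round(high, 1))
```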
Criterion-keying
Criterion-keyed construction selects items based on their ability to discriminate between a criterion group and a control group. Item content is irrelevant and the approach is atheoretical.
• From a large and varied item pool, those items which can discriminate between a criterion group and a control group are selected (regardless of their content).
• A purely empirical approach, not based on theory. (A limitation)
• ex MMPI (Minnesota Multiphasic Personality Inventory), CPI (California Psychological Inventory)
• Based upon true-false or yes-no items.
Limitations:
• No understanding of why such a test works due to the lack of theory behind item selection.
• Likely to produce scales of poor reliability, as scales would be measuring a mix of different possible attitudes.
• The test would be specific only to the group used in construction.
Classic test theory
In construction using classic test theory the aim is to generate a pool of items of various difficulty measuring the same thing (item homogeneity), which is checked by item/total correlations.
• Observed score = True score +/- Measurement error
• A test is generated by analysing items, ex factor analysis, in a pilot sample to make sure they measure only one factor. The end result is a pool of items of differing difficulty which measure only one thing.
• This is done by item-whole/total correlation (correlating every single item with total scores) and item analysis.
• Two criteria for items: item homogeneity (they all measure the same thing) and a difficulty indicator, ex item discrimination analysis.
IRT (Item Response Theory) & Rasch models
Using IRT and Rasch scaling, items first undergo factor analysis to ensure they all measure the same trait and are then analysed with a focus on item difficulty and on how well each item elicits the wanted trait.
• A very large sample is needed.
• Items are constructed and an initial factor analysis is carried out to make sure that all items measure a single trait.
• In order to compute the Rasch parameters the sample data is split into groups of high and low scorers, providing different levels of difficulty.
• If the facility of an item for eliciting the trait is the same for both groups, it is seen as conforming to the model and is chosen for the test.
• On completion of item selection an item-free measurement of all individuals is needed, ex by checking whether sub-groups of items (the hardest vs. the easiest) generate the same scores for each person. If items fit the model, each individual will score the same on both tests.
• Ex item characteristic curves
Factor analysis
Factor analysis is based on analysis of correlations between variables, and is used to check whether items relate to the same trait/theme/factor. The analysis can be oblique or orthogonal.
• A multivariate data reduction tool which enables us to simplify correlations between sets of variables.
• Based upon correlations between variables. In test construction factor analysis is used to check whether items relate to a common theme or factor.
• Used to generate assessments which only measure one factor. With oblique analysis factors are allowed to correlate. With orthogonal analysis factors are uncorrelated, ie. independent.
• ex 16PF, Eysenck Personality Inventory.
Limitations:
• Needs large samples.
• Complex technical problems, therefore good knowledge of its procedures is highly important.
Item Discrimination Analysis
Testing that the people who get any one item right have got more of all the remaining items right than those who got the item wrong.
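One way to read this check is sketched below: for each item, compare the mean score on the remaining items for people who got the item right with the mean for those who got it wrong. The 0/1 response matrix and the function name are invented for illustration. A positive difference suggests the item discriminates in the expected direction, and a negative difference flags a problematic item.

```python
def item_discrimination(responses, item):
    """Mean rest-score of those who passed the item minus that of those who failed.

    responses: one list of 0/1 item scores per person.
    Returns None when everybody passed or everybody failed the item.
    """
    passed, failed = [], []
    for person in responses:
        rest_score = sum(person) - person[item]  # score on all other items
        (passed if person[item] == 1 else failed).append(rest_score)
    if not passed or not failed:
        return None
    return sum(passed) / len(passed) - sum(failed) / len(failed)

# Invented data: 5 people x 4 items (1 = right, 0 = wrong)
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
discrimination = [item_discrimination(responses, i) for i in range(4)]
print(discrimination)
```

In this invented data set the last item comes out negative: the people who got it right did worse on the other items than those who got it wrong, so it would be a candidate for removal.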
COGNITIVE TESTS
WAIS-IV (Article by Weiss L. G. et al. (2009))
FACTS:
• Mean = 100, SD = 15, SEm = 5
• Ages 16-90 (89 for WAIS-III)
• 15 subtests (5 of which are supplemental), 4 index scales
• Takes approx. 67 min. to administer
PURPOSE: The WAIS is a complex test for measuring and assessing intelligence.
THEORY: Wechsler's definition of intelligence includes aspects from both Spearman (the overall g-factor) and Thorndike (qualitatively different abilities).
I.e. the WAIS rests on an assumption of intelligence as a global entity comprising a number of different, mutually independent functions (incl. cognitive and non-cognitive abilities, e.g. motivation, persistence, temperament), which together provide a basis for assessing intelligence.
DEVELOPMENT AND REVISIONS:
• 4 subtests dropped: Object Assembly, Picture Arrangement, Coding Recall and Coding Copy
• New subtests: Visual Puzzles (in place of Picture Completion), Figure Weights and Cancellation
Psychometric improvements:
• Updated norms
• Extended FSIQ range
• Improved floors and ceilings
• Improved user-friendliness, e.g. reduced testing time (from 80 to 67 min.)
• Revised instructions
From Verbal IQ and Performance IQ to index scores (VCI, PRI, WMI and PSI)
The most important difference between WAIS-III and WAIS-IV is that index scores have become the primary level for interpreting results, instead of Verbal and Performance IQ, which were based on the old Army Alpha and Army Beta tests. More weight is thereby placed on Verbal Comprehension, Perceptual Reasoning, Working Memory and Processing Speed, as part of a more differentiated description of intelligence test results that is more consistent with current knowledge about cognition.
Wechsler's intelligence tests
WISC, WAIS, WPPSI, WIAT
SCALES AND ITEMS: 15 subtests (5 of which are supplemental), 4 index scales Verbal Comprehension Index:
• Similarities • Vocabulary • Comprehension • Information • Word Reasoning
Perceptual Reasoning Index: • Block Design • Picture Concepts • Matrix Reasoning • Picture Completion
Working Memory Index: • Digit Span • Letter-‐Number Sequencing • Arithmetic
Processing Speed Index: • Coding • Symbol Search • Cancellation
NORMS AND STANDARDISATION:
• Danish standardisation: 340 people aged 17-70.
• American standardisation: 2450 people, stratified by gender, age, education and sociodemographic data. Covers ages 16-89.
VALIDITY AND RELIABILITY (WAIS-III):
Reliability
• Reliability is very good, tested by split-half and test-retest.
• Subtest reliabilities (between 0,7 and 0,8, with the exception of Vocabulary and Information at 0,9) are lower than the FSIQ reliability, which has been calculated at 0,97.
Validity
• Good to very good convergent-discriminant validity coefficients (construct validity), measured by correlations between VCI/PRI and other Wechsler tests (ex VIQ and PIQ on the WISC-III).
• Good criterion-related validity, e.g. in relation to other Wechsler tests and other IQ tests.
SCORING:
Primary scores
Subtest score, mean = 10 and SD = 3
Index score and FSIQ, mean = 100 and SD = 15
Clinical differences between index scores
• A difference of 12 points or more is considered clinically significant (14 points for PSI)
Composite scores
• General Ability Index (GAI): VCI + PRI, represents the most g-loaded subtests.
• Cognitive Proficiency Index (CPI): WMI + PSI, represents efficient processing through high visual speed and good mental control, both of which
INTERPRETATION:
Verbal Comprehension Index (Vocabulary, Information, Similarities, Comprehension)
• The ability to understand verbal stimuli, work with semantic material and communicate thoughts and ideas in words.
• Crystallised knowledge.
• Depends partly on a person's level of education and general life experience, but also on the ability to understand acquired knowledge and apply it appropriately.
The Vocabulary subtest has the highest ”g” loading and is the best indicator of overall intelligence.
Perceptual Reasoning Index (Block Design, Matrix Reasoning, Visual Puzzles, Figure Weights, Picture Completion)
• Measures fluid intelligence as well as perceptual organisation (since fluid reasoning cannot be measured separately, but requires an object).
• Quantitative, non-verbal fluid intelligence and the ability to hold a visual image in mind while mentally manipulating it.
Working Memory Index (Digit Span, Arithmetic, Letter-Number Sequencing)
• Measures attention, concentration and working memory (Baddeley's model), i.e. the capacity for mental control: holding information in mind (briefly) while performing some form of mental manipulation on it.
• Note that differences between scores on e.g. Digit Span or Letter-Number Sequencing and Arithmetic may reflect that the test-taker has not learned the necessary mathematics, rather than that he/she has specific learning difficulties.
Processing Speed Index (Coding, Symbol Search, Cancellation)
• Measures the speed of mental processing using visual and graphomotor skills, and is related to the efficient use of other cognitive abilities.
• PSI interacts with other higher-order cognitive functions and can affect general cognitive functioning, new learning, reasoning and everyday performance.
• Note that PSI is closely related to age, i.e. it declines with age.
OTHER REMARKS: A BRIEF HISTORY OF INTELLIGENCE TEST INTERPRETATION
1. The first wave: Quantification of general level (Focus: The global IQ and practical considerations regarding the need to classify people into separate groups. Ex Stanford-Binet Scale and Spearman’s g-factor.)
2. The second wave: Clinical profile analysis (Focus: Patterns of high and low subtest scores, which could presumably reveal diagnostic and psychotherapeutic considerations.)
3. The third wave: Psychometric profile analysis (Focus: Psychometric precision and methods in profile analysis, rather than the loose interpretative attempts of clinical profile analysis. However, the lack of empirical support and a theoretical background makes this approach controversial and lacking in validity.)
4. The fourth wave: Application of theory (Focus: Grounding intelligence testing and interpretation of scores on a theoretical basis. The most popular theory in test development and interpretation is the CHC (Cattell-Horn-Carroll) theory.)
Fluid and crystallized intelligence (Cattell)
• Fluid intelligence is the capacity for logical thinking and problem solving in unfamiliar situations, independently of acquired knowledge. The ability to analyse new problems and identify patterns and relationships.
• Crystallised intelligence is the ability to use skills, knowledge and experience. It is not the same as memory, but depends on access to long-term memory. It reflects a lifetime of intellectual achievement, shown e.g. through vocabulary or general knowledge of world events.
WISC-IV (Article by Flanagan, D. P. & Kaufman, A. S. (2009))
DEVELOPMENT AND REVISIONS:
Structural changes from WISC-III to WISC-IV:
• Deleted subtests = Picture Arrangement, Object Assembly and Mazes.
• New subtests = Word Reasoning, Matrix Reasoning, Picture Concepts, Letter-Number Sequencing and Cancellation.
• VIQ and PIQ dropped and replaced by 4 indexes: WMI, PSI, VCI and PRI.
o WHY? Because the discrepancy between the two was overused, and its meaningfulness and clinical utility were never made clear in the literature.
• FSIQ has changed dramatically in content and concept and now includes only 5 of the 10 subtests that traditionally made it up (Similarities, Comprehension, Vocabulary, Block Design and Coding).
• Norms updated.
• Items added to improve floors and ceilings.
SCALES AND ITEMS: WISC-IV consists of 15 subtests – 10 core-battery subtests and 5 supplemental subtests.
G-loadings: VCI subtests generally have the highest g-loadings at every age, followed by the PRI, WMI and PSI subtests, except Arithmetic, which loads more like the VCI subtests.
STANDARDISATION: Sample = 2200 children resembling the 2002 Census data on variables of age, gender, geographic region, ethnicity and socioeconomic status. The sample was divided into 11 age groups, each containing 200 children and split equally between boys and girls.
RELIABILITY:
Average internal consistency:
• of subtests: Ranges from 0,72 (Coding, for ages 6-7) to 0,94 (Vocabulary, for age 15).
• of indexes: VCI = 0,94, PRI = 0,92, WMI = 0,92 and PSI = 0,88.
• of Full Scale IQ: FSIQ = 0,97.
Average test-retest coefficients:
• VCI = 0,93, PRI = 0,89, WMI = 0,89, PSI = 0,86 and FSIQ = 0,93.
Practice effects:
• In general practice effects are greatest for ages 6-7 and become smaller with increasing age. Coding and Symbol Search showed the largest gains (ages 6-7).
Floors and ceilings for all WISC-IV subtests are excellent, which means that WISC-IV can be used with confidence in testing individuals who are functioning either in the gifted or mentally retarded ranges of functioning.
Item gradients refer to the spacing between items on a subtest. Generally these range from good to excellent at all ages in the WISC-IV. This means that the spacing between items is generally small enough to allow for reliable discrimination between individuals on the latent trait measured by the subtest.
VALIDITY:
• Structural validity is supported by factor-analytic studies.
• Positive results of investigations (Keith et al., 2006) of whether the WISC-IV measures the same constructs across its 11-year age span (children from 6-16).
• The nature of these constructs was also investigated and it was concluded that the WISC-IV measures Crystallized Ability, Visual Processing, Fluid Reasoning, Short-Term Memory and Processing Speed.
• Good to excellent convergent-discriminant validity when considering VCI and PRI.
OTHER REMARKS:
• Ipsative/intraindividual interpretations: Interpretation/analysis of an individual’s profile.
SPM+ / MHV FACTS: Brief nonverbal (SPM+) and verbal (MHV) screening measures of general ability.
• For use in educational and clinical settings • Group or individual administration • Allows for comparison with peers • SPM+ and MHV can be administered together or on their own
Raven’s Progressive Matrices (Article by Raven)
PURPOSE: Developed to assess aspects of the g-factor (as described by Spearman, 1927) and its two subcomponents, eductive and reproductive ability.
THEORY: Based on the theory of eductive and reproductive intelligence, which corresponds to fluid and crystallised intelligence (ex the SPM measures fluid/eductive intelligence, the MHV measures crystallised/reproductive intelligence).
DEVELOPMENT: Raven’s Progressive Matrices have been in use for more than 70 years. The first series were based on a test used by Spearman.
1938 – Standard Progressive Matrices (SPM)
1941 – Advanced Progressive Matrices (APM), the most difficult version, for the most able 20 %
1947 – Coloured Progressive Matrices (CPM), for children aged 5-10
The SPM+
Developed to address the Flynn effect, which was very pronounced for Raven’s and produced e.g. strong ceiling effects. In the new version of the SPM the easiest items were therefore removed and some harder ones added.
SCALES AND ITEMS:
SPM+
Consists of 5 sets of 12 multiple-choice problems/items (non-verbal stimuli, e.g. visual patterns and shapes) arranged in a cyclical format, i.e. each set starts with a relatively easy and obvious problem and continues with progressively harder items. When administered according to the standard procedure, Raven’s thus contains a built-in training programme and also assesses the test-taker's ability to learn from experience.
Rasch models: Raven’s does not immediately fit Rasch models well, since people can guess the right answer (because of the multiple-choice structure).
MHV
Consists of 2 sets totalling 88 words to be defined by the test-taker. In one set the test-taker writes the definitions him-/herself, while the other set is multiple-choice.
NORMS AND STANDARDISATION:
o Good norms from 924 children aged 7-18 (2008)
The SPM+ has been standardised numerous times on different populations. The majority of standardisations have been completed on the Classic Form, from which the SPM+ has been developed.
RELIABILITY:
SPM+ reliability
Split-half reliability: r = 0,936, n = 924
Test-retest reliability: r = 0,833, n = 105
SPM+ Standard Error of Measurement and confidence intervals
SEM = 3,79 (standardised scores)
95 % confidence interval ≈ +/- 7 (standardised scores)
MHV reliability
Test-retest reliability = 0,916
Parallel forms reliability = 0,929
MHV Standard Error of Measurement and confidence intervals
SEM = 3,99
95 % confidence interval ≈ +/- 8
(Evidence of SPM+ validity also comes from SPM-C data as the two are similar in form and content.)
IN SUM:
• Good reliability for the SPM (split-half and test-retest) and the MHV (test-retest and parallel forms).
• Raven’s scores can be converted to IQ scores, where they have an SEM (Standard Error of Measurement) of approx. 4 points for both SPM and MHV.
VALIDITY:
Content validity
• Item analysis shows that the properties of SPM+ are relatively stable.
• SPM-C has face validity in cross-cultural settings, ie. its form is not culturally biased.
Criterion-related validity
• Concurrent validation between SPM+ and SPM-‐C/SPM-‐P shows a pooled correlation between 0,8 and 0,83.
• In general, concurrent and predictive validity of the SPM-‐C varies with age, possibly sex, homogeneity of the sample etc.
• Reliable correlations between SPM-‐C and Stanford-‐Binet and Wechsler-‐scales.
• Correlations between the SPM-‐C and performance on achievement and scholastic aptitude tests have generally been lower and more variable than correlations with intelligence tests.
• Lower correlations/concurrent validity between SPM-‐C and measures of verbal and language abilities than with measures of maths and science skills.
Construct validity
• Evidence of age-‐related validity as raw scores increase regularly as children get older and older.
• Evidence from Item Characteristic Curves shows that the items are all measuring a common factor and that the abilities required to solve the problems form part of a continuum, ie. it is generally not possible to solve the more difficult problems if one does not have the abilities to solve the easier problems (ex of a Rasch model.)
• Raven’s is generally described as one of the best measures of g and fluid intelligence (ex factor analysis and cross-‐cultural studies.)
However some factor analytic studies also suggest that Raven’s measures other factors in addition to g. There has especially been evidence of a spatial component.
IN SUM:
Content validity
• Item analysis shows that properties of SPM+ are relatively stable
• Face validity in cross-cultural settings
Criterion validity
• Correlates highly with the Stanford-Binet & Wechsler scales (0,54-0,86)
• Correlations with achievement, scholastic and occupational measures (lower than correlations with intelligence tests)
Construct validity
• Age-related validity
• Evidence from Item Characteristic Curves shows that the items are all measuring a common factor and the abilities required to solve the problems form part of a continuum.
• Described as one of the best measures of g and fluid intelligence, with evidence from factor analytic studies and cross-cultural studies, all revealing high g loadings
ADMINISTRATION: SCORING: INTERPRETATION:
OTHER REMARKS:
The Flynn Effect
Large rises in mean scores since initial publication (also seen in other psychometric tests, ex WISC). Much work on the Flynn Effect has come from analysis of Raven’s Progressive Matrices. Flynn showed that, on average, IQ scores increased by 0,3 IQ points every year and had been doing so throughout most of the 20th century. He argues that the rise is due to the increasing influence of scientific ways of thinking. The Flynn Effect appears to be universal, with similar results being reported in over 14 countries.
OCCUPATIONAL TESTS
360° Feedback (Brett & Atwater, MRG)
FACTS: A process in which subordinates, peers and bosses provide anonymous feedback to managers, who also rate their own performance. The LEA (Leadership Effectiveness Analysis, the 360° instrument from MRG) was not designed to be used as:
• a measure of personality • a basis for termination • a direct measure of manager/leader performance
PURPOSE: The profile provides information about the focus person’s views on his own leadership role and his boss’, colleagues’ and employees' perception of the leader’s behavior -‐ all to increase the organizational effectiveness of the individual leader and the individual management team. THEORY: Is supposed to provide developmental feedback, which can improve performance by creating awareness and motivating individuals to change behaviour eg. if ratings from others are lower than self ratings. Leadership "sets" = the theoretical basis for the LEA 360.
• Def.: A "set" indicates the probability that a leader will behave consistently across a broad range of managerial challenges.
DEVELOPMENT:
• Based on empirical studies of leaders and leadership behavior. • A role-based behavioral analysis. • Leadership ”sets” = the theoretical basis for LEA 360. • Originally 35 leadership sets, reduced to the current 22 sets.
SCALES AND ITEMS: The profile is a web-based analysis of behavior that is conducted among the manager himself, his boss, peers and direct reports. Behavior is measured and analyzed in the following main areas:
-‐ Creating a vision -‐ Developing followers -‐ Implementing the vision -‐ Following through -‐ Achieving results -‐ Team playing
VALIDITY AND RELIABILITY:
Test-retest:
• Average test-retest reliability of 0,77-0,80.
Inter-rater: (Extensive inter-rater reliability studies using the ratings of 1068 bosses, 2592 peers and 2544 direct reports. Intra-class correlation coefficients were used to assess inter-rater reliability.)
• Boss ratings: coefficients ranged from 0,58 (2 raters) to 0,80 (4 raters).
• Peer ratings: coefficients ranged from 0,67 (4 raters) to 0,80 (8 raters).
• Direct report ratings: coefficients ranged from 0,66 (4 raters) to 0,79 (8 raters).
Internal consistency measures were not conducted, as they are not appropriate for the (semi-ipsative) format used in LEA.
IN SUM:
• Test-retest reliability of 0,77-0,80
• Inter-rater: boss (0,58-0,80), peers (0,67-0,80), direct reports (0,66-0,79)
• No internal consistency measures, as these are not relevant for the LEA 360
ADMINISTRATION:
• The leader is evaluated by him-/herself, his/her nearest boss, selected peers (min. 2) and selected direct reports (min. 2)
• The profile does not tell whether the observers like or dislike the leader’s style, but how the leader is perceived to use his/her energy and administers his/her leadership.
• In consultation with the consultant, advantages and disadvantages in the profile are discussed, in relation to the leader’s current challenges.
OTHER REMARKS:
• According to Brett & Atwater, 360-degree feedback is most useful when the focus person receives high ratings/feedback that confirm his or her own (high) self-ratings.
Myers-Briggs Type Indicator
(personality test based on Jung's typology)
NEUROPSYCHOLOGICAL TESTS
FACTS:
• Part of CAMDEX
• Also contains the MMSE (Mini-Mental State Examination)
• Total score ranges from 0 to 107
• 7/8 subscales, e.g. Memory and Attention
• Relatively free of floor and ceiling effects
PURPOSE: A short and precise neuropsychological test for assessing cognitive impairment in older adults. Specifically constructed for the early diagnosis of dementia.
CAMCOG
THEORY & DEVELOPMENT: The CAMCOG tasks are designed to examine the cognitive areas included in operational diagnostic criteria (e.g. according to DSM-IV and ICD-10), i.e. orientation, language, memory, attention, praxis, abstract thinking and perception. All MMSE items are included in the CAMCOG, but not all items are used to compute the CAMCOG score.
SCALES AND ITEMS:
• Orientation (time and place)
• Language (comprehension, motor and verbal response, reading, expression, naming, definitions, repetition, dictation)
• Memory (short-term, long-term, episodic, semantic, new learning, recognition etc.)
• Attention and calculation (100-7, counting backwards, calculation)
• Praxis (copying, drawing, acting on command)
• Abstract thinking (similarities)
• Perception (tactile recognition (not incl. in CAMCOG-R), visual recognition, unusual views, recognizing people)
NORMS AND STANDARDISATION: Odense norms: 217 people, aged 65-89, randomly drawn from the CPR register.
VALIDITY AND RELIABILITY:
• Test-retest (2.5 years) r = .78
• CAMCOG-Age r = -.47, i.e. CAMCOG scores decline with age.
• CAMCOG-DART r = .59, i.e. the higher the CAMCOG score, the higher the DART score.
• Age-DART r = -.19, of no clinical significance.
• Educational level had no significant relation to the variance in the CAMCOG sum score (once differences in DART were taken into account).
• People in the lowest social class in particular obtained lower CAMCOG scores.
Sensitivity and specificity
• At a cut-off score of ≥ 88, sensitivity was 85,2 % and specificity 72,4 %
• At a cut-off score of ≥ 85, sensitivity was 74,1 % and specificity 82 %
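The trade-off between the two cut-offs can be made concrete with a small calculation. This is a sketch on invented data, not the Odense sample. It assumes that lower scores indicate impairment, so a score below the cut-off counts as screen-positive (the notes do not spell out the direction), and the function name is hypothetical.

```python
def sensitivity_specificity(scores, has_dementia, cutoff):
    """Return (sensitivity, specificity) when a score below `cutoff`
    is treated as a positive screen (assumed direction)."""
    tp = fn = tn = fp = 0
    for score, ill in zip(scores, has_dementia):
        positive = score < cutoff
        if ill and positive:
            tp += 1      # ill and correctly flagged
        elif ill:
            fn += 1      # ill but missed
        elif positive:
            fp += 1      # healthy but flagged
        else:
            tn += 1      # healthy and correctly passed
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 4 people with dementia, 6 without.
scores = [70, 80, 86, 90, 84, 87, 92, 95, 99, 101]
has_dementia = [True, True, True, True, False, False, False, False, False, False]

for cutoff in (88, 85):
    sens, spec = sensitivity_specificity(scores, has_dementia, cutoff)
    print(cutoff, round(sens, 2), round(spec, 2))
```

As in the figures above, lowering the cut-off in this toy data reduces sensitivity and raises specificity.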
ADMINISTRATION & SCORING: Highest possible total score = 105. It is also possible to compute subscores for hypothetically dissociable functions and a separate executive function score. The CAMCOG takes approx. 20 minutes to complete.
PERSONALITY TESTS
FACTS:
• 240 items in the form of statements that the test-taker rates on a Likert-type five-point scale, from strongly disagree through neutral to strongly agree
• Other versions incl. a shorter version, the NEO-FFI (NEO Five-Factor Inventory), which consists of 60 items that assess only the five factors, and the NEO-PI-3, from which hard-to-read items have been removed so that it can be used down to age 12.
NEO-PI
PURPOSE:
• Assesses 5 broad personality dimensions or domains, each with 6 underlying traits or facets, so that 30 personality traits are scored in total
• Measures traits and stable characteristics (trait) of the normal personality, not psychopathological states (state).
THEORY:
• The Big Five/five-factor model (currently the most dominant empirically based model of personality)
DEVELOPMENT:
• Not based on any single theory.
• The selection of traits was based on reviews of the personality literature as a whole. At first only 3 factors (NEO) were identified, but after work with the natural (lay) language of personality traits, two more were added.
• Later the five factors were related to Murray’s needs, Jung’s types, Gough’s folk concepts etc. and were thereby grounded in theory.
• Scales were developed using a combination of rational and factor analytic methods.
SCALES AND ITEMS:
• Neuroticism (N) • Extraversion (E) • Openness to experience (O) • Agreeableness (A) • Conscientiousness (C)
Each scale consists of 6 subtraits assessed with 8 items each = 240 items in total.
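The item arithmetic, together with the scorability rule given under SCORING for this test (not scored with 40+ unanswered items or strings of repetitive responses), can be sketched as follows. The representation of responses as a list of 1-5 ratings with None for unanswered items, the function names and the run limit of 6 (taken from the example in the notes) are assumptions for illustration.

```python
DOMAINS = ["N", "E", "O", "A", "C"]
FACETS_PER_DOMAIN = 6
ITEMS_PER_FACET = 8
TOTAL_ITEMS = len(DOMAINS) * FACETS_PER_DOMAIN * ITEMS_PER_FACET  # 5 * 6 * 8 = 240


def longest_identical_run(responses):
    """Length of the longest string of identical consecutive answers.
    Unanswered items (None) break a run."""
    longest = current = 0
    previous = object()  # sentinel that equals no real answer
    for answer in responses:
        if answer is not None and answer == previous:
            current += 1
        else:
            current = 1 if answer is not None else 0
        previous = answer
        longest = max(longest, current)
    return longest


def is_scorable(responses, max_missing=39, run_limit=6):
    """Apply the two exclusion rules from the notes (thresholds assumed)."""
    missing = sum(1 for answer in responses if answer is None)
    return missing <= max_missing and longest_identical_run(responses) < run_limit


varied = [1, 2, 3, 4, 5] * 48  # 240 answers with no repetition
print(TOTAL_ITEMS, is_scorable(varied))
```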
NB: No validity scales to detect lying, defensiveness or faking bad.
VALIDITY AND RELIABILITY:
Reliability
- High internal consistencies/coefficient alphas for the 5 factors, ranging from 0,88 to 0,92
- Coefficient alphas for the 8-item facet scales are understandably lower, from 0,51 to 0,86.
- High test-retest (2 weeks) reliabilities of 0,86 to 0,90 for the NEO-FFI scales and high test-retest (2 years) reliabilities for the factors (N, E, O, A, C), ranging from 0,83 to 0,91.
- Cross-observer agreement (the correlation between self-report and observer) is usually in the range of 0,4 to 0,6.
Validity
- Meaningful correlations with MMPI, MCMI (and PAI, BPA)
- Predictive validity: useful in predicting work interests, ego development, attachment styles and psychiatric diagnoses of personality disorder.
ADMINISTRATION:
• Individually or in groups
• Ages 18+ (in the NEO-PI-3, hard-to-read items have been removed, and it can therefore be used down to age 12)
SCORING:
• Objective scoring
• The NEO-PI is not scored if 40+ questions are unanswered or if strings of repetitive responses are noted (e.g. 6 consecutive "strongly disagree").
FACTS:
• A 550-item (agree, don't know, disagree) self-report inventory.
• Age range 18+
• Reading level: minimum 9th grade
• Developed by Hathaway and McKinley
• MMPI (1942), restandardised to MMPI-2 (1982).
• Administration time: 1-2 hours
Other versions:
• MMPI-RF (Restructured Form), 338 items (none new), can be scored from the MMPI-2, has kept the K and L validity scales and incl. 50 restructured clinical scales.
PURPOSE: The test measures psychopathology and normal/abnormal personality functioning. Created for assessing clinical patients and evaluating the effects of therapy.
THEORY: None.
DEVELOPMENT:
• Developed using the "contrasting groups" method, i.e. empirical criterion keying
• Comparing 8 clinical groups (n=50) with 'normals' (n=750)
MMPI
SCALES AND ITEMS: Items = 550 items divided into 25 content areas related to general medical and neurological symptoms, political and social attitudes, affective and cognitive symptoms, fears and obsessions, family, educational and occupational experience, masculinity-femininity, and items revealing an overly virtuous self-presentation on the inventory (a kind of validity scale).
Validity scales:
• L (Lie) = Assesses attempts to place oneself in a morally favorable light by denying moral imperfections.
• F (Infrequency) = Assesses the tendency to claim highly unusual attitudes or behaviours related to severe psychopathology. The person is trying to place him/herself in an unfavorable light.
• K (Correction) = Assesses the tendency to control and limit the reporting of distress, discomfort and problems relating to other people. (Fractions of K are added to certain other scales to discourage false positive or negative scores.)
• ? (Cannot say) = Max. 10 ? answers. Reflects indecisiveness, noncompliance, dyslexia?
Other scales
• VRIN (Variable Response Inconsistency) = contradictory responses • TRIN (True Response Inconsistency) = everything is true.
The clinical scales: 8 basic scales developed from Hathaway's pathological criterion groups and 2 other scales, the Masculinity-Femininity (Mf) scale and the Social Introversion (Si) scale.
1. Hypochondriasis (K correction)
2. Depression
3. Hysteria (somatization)
4. Psychopathic Deviate (K correction)
5. Masculinity-Femininity (derived from the normal population)
6. Paranoia
7. Psychasthenia (mental weakness) (K correction)
8. Schizophrenia (K correction)
9. Hypomania (K correction)
10. Social Introversion (derived from the normal population)
NORMS AND STANDARDISATION:
Norms
The Minnesota normals, groups of different psychiatric patients (used to develop the scales) and a later restandardisation sample (2600 people) conforming to US census data, except with regard to educational level and occupational attainment.
Hypernormal bias
The repeated use of the Minnesota normals as contrasts for the pathological criterion groups deprived those groups, in a statistical sense, of their "normal" levels of pathology. This created a hypernormal bias, meaning that newly collected normals consistently scored higher than the Minnesota normals.
Danish norms
• Danish translation & norms (523 of 2000 from CPR)
VALIDITY AND RELIABILITY:
- Objective scoring and high scorer reliability.
- Temporal stability/test-retest reliability (0,58-0,92) of the standard clinical scales reflects both continuity and change in symptoms and personality.
SCORING: Raw scores are converted into T-‐scores.
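As an illustration of the conversion, the generic linear T-score (mean 50, SD 10) can be computed from a raw score and norm-group statistics. This is a sketch only: in practice the MMPI conversion is done with the published norm tables, and the norm mean and SD below are invented. The threshold of 65 is the clinical cut-off given under INTERPRETATION for this test.

```python
def linear_t_score(raw, norm_mean, norm_sd):
    """Generic linear T-score: mean 50 and SD 10 in the norm group."""
    return 50 + 10 * (raw - norm_mean) / norm_sd


def is_clinically_elevated(t_score, cutoff=65):
    """T > 65 is the clinical threshold quoted in the notes."""
    return t_score > cutoff


# Hypothetical scale with norm mean 20 and SD 5 (invented numbers).
t = linear_t_score(raw=28, norm_mean=20, norm_sd=5)
print(t, is_clinically_elevated(t))
```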
MMPI profile code
After the T-scores have been plotted on the main MMPI-2 profile form, the profile can be represented as a numerical code. This is done by recording the scale numbers (e.g. 1 for Scale 1) in descending order and then punctuating this series of numbers with symbols indicating ranges of elevation. The same procedure applies to the validity scales F, K and L. Scores falling within a single T-score of one another are underlined.
Factor scales
• First factor: incl. general maladjustment and distress • Second factor: emotional-‐behavioural control
INTERPRETATION: T-score > 65 = clinically significant and represents the boundary between normal and pathological.
• Test-taking attitude
• Factor structure
• Pattern vs. content
• MMPI-2 Structural Summary: An organization of all MMPI-2 scales on the basis of their content: Test-taking Attitudes, Factor Scales, Moods, Cognitions, Interpersonal Relations, Other Problem Areas
FACTS:
• 175-item true/false self-report inventory.
• Age range = 18+
• Administration time 20-30 min
• Reading level: minimum 8th grade
• Developed by Millon. MCMI (1981), MCMI-II (1987), MCMI-III (1994).
• NOT to be used with normals
• Uses Base-Rate scores (not e.g. T-scores)
• Version III was constructed to reflect the diagnostic criteria of DSM-IV (1994)
PURPOSE: For use with adults who are being evaluated and/or treated in mental health settings – should NOT be used with ”normal” individuals. Instrument for assessment of personality disorders and major clinical syndromes, as described by Millon (but later also in accordance with DSM categories.)
MCMI (Article by Craig)
THEORY:
Millon's own bioevolutionary model of personality and psychopathological development.
DEVELOPMENT: Based on theory. Development guided by Loevinger's 3 steps of test development and validation:
• Theoretical – Substantive • Internal – Structural • External – Criterion
SCALES:
• 12-24 items in each scale
• Several items overlap different scales
• Items are prototypic or non-prototypic (weighted 2 or 1 point in scoring)
• Scales coordinated with the DSM system
Validity Index (Detects random responding and confusion)
Modifying Indexes
X. Disclosure (Measures willingness to admit to symptoms)
Y. Desirability (Measures faking good)
Z. Debasement (Measures faking bad)
Clinical Personality Pattern Scales
• Schizoid
• Avoidant, incl. Depressive
• Dependent
• Histrionic
• Narcissistic
• Antisocial, incl. Aggressive/Sadistic
• Compulsive
• Passive-aggressive, incl. Self-Defeating
Severe Personality Pathology Scales
• Schizotypal
• Borderline
• Paranoid (DSM)
Clinical Syndrome Scales (Axis I Symptom Scales)
• Anxiety
• Somatoform
• Hypomania/Bipolar
• Dysthymia
• Alcohol/drug abuse
• PTSD
Severe Syndrome Scales
• Psychotic thinking (Thought disorder)
• Psychotic depression (Major depression)
• Psychotic delusion (Delusional disorder)
NORMS AND STANDARDISATION: 998 psychiatric patients representing a broad range of demographic characteristics. Ages ranged from 18 to 88 (80 % between 18 and 45). Most had completed high school. A limitation of the norm sample was the low representation of Blacks, Hispanics and other groups.
• Remember that the MCMI is based only on psychopathological norm material and therefore cannot be used with the normal population.
Danish norm material also exists.
RELIABILITY:
• Internal Consistency 0.67-‐0.90 (Cronbach’s Alpha) • Test-‐retest 0.84-‐0.96 over 1 week • Better for Personality Disorders than for Clinical Syndromes
VALIDITY: Development guided by Loevinger’s 3 steps of test development and validation:
• Theoretical – Substantive • Internal – Structural • External – Criterion
ADMINISTRATION: Developed for use with men and women (18+) who are seeking mental health evaluation or treatment, and can read minimum at 8th grade level. NOT meant for use with nonclinical populations! This will result in distorted test results. Can be administered individually or in groups, in a pencil-‐and-‐paper form or via computer. Administration time is usually 20 – 30 minutes. No special instructions are required.
SCORING: The test cannot be scored if:
a. gender is not indicated b. age is less than 18 c. more than 12 items are unanswered
Prototypical and nonprototypical items
Scale items are given a weight of 2 when they represent central, or prototypical, features of a given personality or syndrome. Less defining, nonprototypical, characteristics are given a weight of 1.
Raw scores and BR (Base Rate) scores
Raw scores for all scales except Disclosure (X) are calculated by adding up the items endorsed on the scale, taking care to assign the proper weight (1 or 2) to each item. Scores (except Validity) may then be converted to BR scores using the tables provided in the test manual.
Response bias corrections
Initial BR scores can then be subjected to 4 possible corrections that compensate for the effects of response biases: Disclosure, Anxiety-Depression, Recent Inpatient Admission and Denial-Complaint.
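The scorability rules and the weighted raw score described above can be sketched as follows. The scale key here is invented for illustration (real keys and BR conversion tables come from the test manual), and the data layout, a dict from item number to True/False/None, is an assumption.

```python
def can_be_scored(gender, age, responses, n_items=175):
    """The three exclusion rules from the notes: gender missing,
    age under 18, or more than 12 unanswered items."""
    unanswered = sum(
        1 for item in range(1, n_items + 1) if responses.get(item) is None
    )
    return gender is not None and age >= 18 and unanswered <= 12


def weighted_raw_score(responses, scale_key):
    """Sum the weights of endorsed items. `scale_key` maps item number to
    (keyed_response, weight), with weight 2 for prototypical items and
    1 for nonprototypical ones."""
    return sum(
        weight
        for item, (keyed_response, weight) in scale_key.items()
        if responses.get(item) == keyed_response
    )


# Hypothetical three-item scale key, for illustration only.
example_key = {1: (True, 2), 2: (True, 1), 3: (False, 1)}
answers = {item: False for item in range(1, 176)}
answers[1] = True  # endorses the prototypical item

print(can_be_scored("F", 30, answers), weighted_raw_score(answers, example_key))
```

Here item 1 contributes 2 points, item 2 is not endorsed and item 3 contributes 1 point, giving a raw score of 3. The conversion to BR scores and the four bias corrections are not sketched, since they depend on the manual's tables.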
INTERPRETATION: Based on an assessment of the following:
• Response style
• Validity (there are three items designed to catch invalid responding: "In the past year I have flown across the Atlantic 30 times", "I was on the cover of several magazines last year" and "I have not seen a single car in the last 10 years".)
• Clinical personality patterns
• Severe personality pathology
• Clinical symptoms
• Severe clinical syndromes
FACTS:
• 90 items – 15-20 minutes
• 5-point scale (0 = not at all – 4 = extremely)
• Concerning the preceding week
• Uses T-scores
PURPOSE:
• Used to assess the overall degree of psychological distress that the respondent has experienced during the past seven days.
• The instrument yields a profile reflecting the symptom complexes in which this distress has been most prominent.
SCALES AND ITEMS:
Symptom scales
• Somatization, obsession-compulsion, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation and psychoticism.
NORMS AND STANDARDISATION:
• Danish norms significantly above the US ones on most scales
• All items excluding the Psychoticism scale form a good Rasch scale! (But the scale correlates 0.99 with the GSI index)
• Gender differences: Females score higher on Somatization, Interpersonal Sensitivity, Depression, Anxiety, Phobic Anxiety and GSI
• Due to differences in disclosure??
VALIDITY AND RELIABILITY:
• Reliability GOOD
• Internal consistency
SCL-90
SCORING:
Global indices
• GSI (Global Severity Index): An overall global measure of psychiatric symptom severity.
• PST (Positive Symptom Total): the number of positive symptoms, regardless of their severity and value; it thus describes the breadth of the symptom picture.
• PSDI (Positive Symptom Distress Index): an average measure of the severity of the symptoms the person experiences.
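The three indices can be sketched from the 0-4 item ratings. The formulas below (GSI as the mean rating over answered items, PST as the count of items rated above 0, PSDI as the sum of ratings divided by PST) are the commonly used definitions and are an assumption here, since the notes describe the indices only in words. The results are raw indices, which would still need converting to T-scores via norms.

```python
def global_indices(item_scores):
    """Return (GSI, PST, PSDI) from a list of 0-4 ratings.
    Unanswered items are given as None and are left out."""
    answered = [score for score in item_scores if score is not None]
    total = sum(answered)
    pst = sum(1 for score in answered if score > 0)   # breadth of symptoms
    gsi = total / len(answered)                       # overall severity
    psdi = total / pst if pst else 0.0                # intensity of reported symptoms
    return gsi, pst, psdi


# Invented answer sheet: 30 items rated 2, 10 items rated 4, 50 items rated 0.
sheet = [2] * 30 + [4] * 10 + [0] * 50
gsi, pst, psdi = global_indices(sheet)
print(round(gsi, 2), pst, psdi)
```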
Symptom scales
• Somatization, obsession-compulsion, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation and psychoticism.
INTERPRETATION:
• The SCL-90 is not a diagnostic instrument, i.e. there are no cut-off scores for different disorders.
• It has been proposed to define a "case" by a GSI T-score above 63 or by at least two symptom scales with a T-score above 63.
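The proposed case definition translates directly into a small check. This is a sketch: the function name and the scale labels in the example are illustrative, and the T-scores are invented.

```python
def is_case(gsi_t, scale_t_scores, cutoff=63):
    """Case = GSI T-score above 63, or at least two symptom scales
    with a T-score above 63 (the rule proposed in the notes)."""
    elevated = sum(1 for t in scale_t_scores.values() if t > cutoff)
    return gsi_t > cutoff or elevated >= 2


# Invented profile: GSI below the cut-off, but two scales above it.
profile = {"somatization": 58, "depression": 66, "anxiety": 64, "hostility": 50}
print(is_case(gsi_t=61, scale_t_scores=profile))
```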
PROJECTIVE TESTS
FACTS:
- 10 cards, of which 7 are black and white (apart from red blots on cards II and III) and 3 are coloured.
- Suitable from age 5 upwards, but mostly used with adults.
- In a test battery it is a good idea to start with the cognitive tests, then the projective (Rorschach) and finally the apperceptive-dynamic (TAT).
PURPOSE: The Rorschach test (and similar tests) gives insight into how people organize their world and how they process visual impressions etc.
THEORY:
DEVELOPMENT:
SCALES AND ITEMS:
NORMS AND STANDARDISATION: Exner's scoring system: Developed on the basis of comparative studies of different Rorschach scoring methods and statistical analysis of extensive test material/data.
VALIDITY AND RELIABILITY: Several studies show low reliability and a lack of predictive validity.
Rorschach
ADMINISTRATION:
• Raw protocol
• Location (noted during test administration.)
• Min. 14 responses to the whole test before it is considered valid!
• Inquiry (conducted after test administration, with the aim of clarifying location and determinants.)
SCORING:
• Location: (W, D, Dd and S)
• Determinants: (Form (+ form quality), Movement, Colour, Shading, Pairs, Form-reflection, Texture)
• Content: 21 categories, e.g. animal, human, human detail, art, science etc.
• Popular responses: There are 23 possible popular responses in total.
• Z-score: A score for the degree of perceptual integration.
• Special scores: 14 different codes for different types of deviations in interpretation and verbalization.
Exner's "the Comprehensive Scoring system". The number of determinants, response types etc. is tallied relative to what is normal for the age group (empirical data).
Key Variables: Dominant elements of the personality structure, e.g. PTI (Perceptual Thinking Index), DEPI (Depression Index), CDI (Coping Deficit Index), OBS (Obsessional Index).
Cluster Interpretation: The first phase of Exner's interpretation method. The Key Variables determine the order of the clusters followed in the interpretation (the most prominent first), e.g. Mediation, Ideation, Processing, Affect, Controls, Situation Stress, Self Perception, Interpersonal Perception.
Other scoring systems
Exner's is the most widely used scoring system, but others are also used. For the assessment of thought disorder, for example, the TDI (Thought Disorder Index) is used.
FACTS:
PURPOSE:
THEORY:
DEVELOPMENT: Developed by Murray (1938). Murray was influenced by psychoanalysis but developed his own personology, which particularly emphasized the importance of motivation and needs.
SCALES AND ITEMS:
• 30 pictures, of which 1 is blank.
• Cards 1, 2, 3BM, 4, 10 and 20 are always used
• The TAT pictures largely depict the following social interaction situations: man-woman, mother-daughter, father-son, mother-son, father-daughter, parents-child, triangle situation (man between two women).
• No pictures depict peer groups, interaction in work settings, or other kinds of cooperation or conflict situations in which more than three people are present.
NORMS AND STANDARDISATION:
TAT
VALIDITY AND RELIABILITY:
• The TAT is one of the most widely used personality tests in the USA, despite repeated debate about its validity.
ADMINISTRATION:
SCORING:
INTERPRETATION:
• One of the aspects that stands out clearly in this test is cognitive style.
• One can also look for repetitive patterns, e.g. conflict-resolution strategies (by looking at common features of the themes that emerge.)
• Defence patterns, e.g. perceptual defence. It can be useful to note which objects are not described, not perceived or possibly misperceived.
• The psychomotor pattern (i.e. facial expression, gestures, motor behaviour etc.), its stability and breaks in it. Large irregularities or disturbances may point to a form of state dependence.
Anne E. Thompson's method of analysis:
• Analyses the individual TAT pictures using an "Affect Maturity Scale" consisting of 5 levels, where 1 is the least differentiated.
• Affect maturity constitutes a fundamental capacity that determines how a person experiences and handles his or her emotions. It underlies the ability to tolerate emotions and the ability to subject them to reality testing.
FACTS:
• 70 stimulus words
• All words are nouns, but a few can also be read as verbs
• Some words are neutral, others have aggressive or sexual connotations
PURPOSE: The word association test is best suited to uncovering regressive features of thinking, i.e. suited to psychological assessment of psychotic patients, especially patients with schizophrenia. Not particularly well suited to neurotic and milder personality disorders.
DEVELOPMENT: Part of Rapaport, Gill and Schafer's diagnostic test battery. The test is inspired by an earlier association test developed by C. G. Jung and by Kent and Rosanoff.
SCALES AND ITEMS:
• The test consists of 70 stimulus words (in the Danish version). All are nouns, but some can also be read as verbs. Some of the words are neutral; others relate to aggressive and sexual ideas and can be emotionally provocative.
NORMS AND STANDARDISATION:
• Popular responses
• Original responses
Word association test
ADMINISTRATION:
• The test has two parts. In the first part the test-taker is asked to say the first word that comes to mind. In the second part each stimulus word is read to the test-taker again, and he/she is asked to reproduce the response given to each word.
• Reaction times are recorded and the responses are written down during the test, along with any other reactions that occur during testing.
• The inquiry is conducted immediately after the reproduction. Here the examiner asks about words that elicited particular reactions in the test-taker.
SCORING and INTERPRETATION: The test is interpreted on the basis of:
• Content (e.g. possible conscious and unconscious areas of conflict and associative themes)
• Formal characteristics
o Close associations: e.g. repetition of the stimulus word, self-reference (e.g. hus → mit hus, "house → my house"), clang associations (e.g. hat → kat, "hat → cat"), alliterations etc.
o Distant associations: where there is apparently no connection between stimulus word and response (e.g. bog → kalkun, "book → turkey"), where there is only a weak connection (e.g. dans → spise, "dance → eat"; hus → tomt, "house → empty"), or loosely generalized responses (e.g. stol → hus, "chair → house").
• Prolonged reaction times and original responses (cf. Ivanouw). Ivanouw argues that the test is best used to rule out suspicion of serious psychopathology, and that for this purpose one should look at the combination of prolonged reaction times and a number of original responses below 20. A simultaneous number of popular responses of at least 14 is a further sign of the absence of serious psychopathology.