
DOCUMENT RESUME

ED 248 190    SP 024 016

TITLE        Florida Teacher Certification Examination. Technical Report 1981-1982.
INSTITUTION  Florida State Dept. of Education.
PUB DATE     84
NOTE         57p.; For related documents, see SP 024 017-020.
PUB TYPE     Reports - Descriptive (141) -- Statistical Data (110)
EDRS PRICE   MF01/PC03 Plus Postage.
DESCRIPTORS  Higher Education; Knowledge Level; *Measurement Techniques; *Minimum Competency Testing; Preservice Teacher Education; Quantitative Tests; Reading Tests; Standardized Tests; State Standards; *Teacher Certification; Teaching Skills; *Test Construction; Test Interpretation; Test Reliability; *Test Results; *Test Theory; Test Validity; Writing Skills
IDENTIFIERS  *Florida Teacher Certification Examination

ABSTRACT
The Florida Teacher Certification Examination (FTCE) is based upon selected competencies that have been identified as minimal entry-level skills for prospective teachers. A description is given of the four subtests which make up the FTCE: (1) writing -- essay on general topics; (2) reading -- multiple choice "cloze" procedure on general education passages derived from textbooks, journals, and state publications; (3) mathematics -- multiple choice questions on basic mathematics, simple computation, and "real world" problems; and (4) professional education -- multiple choice questions on general education (personal, social, academic development, administrative skills, and exceptional student education). This technical report provides information on the test's creation and assembly, administration procedures, and scoring and reporting. A table presents statistics on the number and percent of students (1981-82) passing all subtests, by education major or program. The psychometric characteristics of validity, reliability, item discrimination, and contrasting group performance of the FTCE are discussed. Appendices provide information on: (1) essential competencies and subskills tested; (2) mathematical illustrations of formulas; (3) security and quality control procedures; and (4) scoring the writing examination. (JD)

Reproductions supplied by EDRS are the best that can be made from the original document.


State of Florida
Department of Education
Teacher Certification Section
Tallahassee, Florida 32301

Ralph D. Turlington, Commissioner
Affirmative action/equal opportunity employer

Questions relative to this report or the Florida Teacher Certification Examination may be directed to the staff by calling 904/488-8198 or by writing:

Student Assessment Section
Department of Education
Knott Building
Tallahassee, Florida 32301

State of Florida
Department of Education
Tallahassee, Florida
Ralph D. Turlington, Commissioner
Affirmative action/equal opportunity employer

FLORIDA: A STATE OF EDUCATIONAL DISTINCTION. "On a statewide average, educational achievement in the State of Florida will equal that of the upper quartile of states within five years, as indicated by commonly accepted criteria of attainment."

349-020784-MCP-200-1


FLORIDA TEACHER CERTIFICATION EXAMINATION

TECHNICAL REPORT

1981-1982

Student Assessment Section
Bureau of Program Support Services
Division of Public Schools
Department of Education
Tallahassee, Florida 32301

Copyright
State of Florida
Department of State
1984


TABLE OF CONTENTS

INTRODUCTION
    Background
    Description of the Examination
    Rasch Calibration of Items

TEST COMPOSITION, ADMINISTRATION, AND SCORING
    Test Creation and Assembly
    Administration Procedures
    Scoring and Reporting

TEST RESULTS FOR 1981-82

PSYCHOMETRIC CHARACTERISTICS
    Validity
    Reliability
    Discrimination
    Contrasting Group Performance

SUMMARY

APPENDICES
    A. Essential Competencies and Subskills
    B. Mathematical Illustrations of Formulas
    C. Security and Quality Control Procedures
    D. Scoring the Writing Examination

INTRODUCTION

Background

Each applicant for an initial Florida teacher's certificate must pass the Florida Teacher Certification Examination (FTCE). The FTCE was established by Section 231.17, Florida Statutes, and is administered by the Florida Department of Education.

The competencies that form the basis for the Florida Teacher Certification Examination were identified through a study conducted by the Council on Teacher Education (COTE)1. As a result of the study, twenty-three Essential Generic Competencies were established upon which to base the Examination and to form a part of the curricular requirements at Florida colleges and universities with approved teacher education programs. Later legislative action combined two of the competencies, numbers six and nineteen, and created an additional competency dealing with education for exceptional students.

An ad hoc task force convened by the Department of Education developed subskills for the identified competencies. The subskills were reviewed and critiqued by various individuals and organizations, including a random sample of certified education personnel, statewide professional teacher organizations, and all colleges and universities with approved teacher education programs. The twenty-three Essential Generic Competencies and the subskills are listed in Appendix A.

Test item specifications were written for each subskill. Specifications are rules and parameters for writing test items to assess a particular subskill. They provide information such as the length of the stimuli, the mode of the stimuli (graph, problem situation, mathematical algorithms), the characteristics of the stem (question, statement completion), the characteristics of the correct answer, and the characteristics of the foils. The specifications also include detailed information about the content upon which the tests are based. The complete specifications are contained in the Florida Teacher Certification Examination Bulletin II: The General Education Subtests -- Reading, Writing, Mathematics and in the Florida Teacher Certification Examination Bulletin III: The Professional Education Subtest. Copies are available from the Department for a nominal fee.

Passing scores for each subtest were recommended by a panel of judges, all of whom were either current or past members of COTE and who had been involved in the development of the Examination. The panel was made up of classroom teachers, school administrators, teacher educators, and community representatives. Passing score recommendations were made to the Commissioner of Education for each subtest. These recommendations were adopted as a rule by the State Board of Education on July 30, 1980.

----------
1COTE was a statutory advisory council appointed by the State Board of Education to advise the Commissioner of Education on all matters dealing with teacher education and certification. COTE was replaced by the Florida Education Standards Commission in 1980.


The operational tasks of preparing test forms, administering the tests, and scoring the answer sheets are completed through an external contract. The contract for these tasks was awarded to the University of Florida Office of Instructional Resources for the three administrations of the 1981-82 school year.

Periodically, contracts are issued for the development of additional test items. New items are needed to maintain a large pool of high-quality and secure test items. A large item pool makes it possible to develop alternate forms of the test so that an examinee who retakes a subtest will receive a new set of questions.

All 4"-imi development is subject to the restrictions of the itemspecific's-tons. Test development contractors must provide intensive itemreviews aii conduct pilot tests of the items. Following this, the Departmentinvites a panel of college and university educators to review the new items.This reriew consists of a critical reading of each item for possible bias,adequate subject content, and adequate technical quality. After the new itemshave been thoroughly reeview40 and revised they are field- tested by imbeddingthem in a regular test formfand administering them to a sample of examinees.The item difficulties are calibrated with latent trait techniques and equated toexisting items. Later forma of the FTCE contain the new items.

Description of the Examination

The FTCE is administered three times a year at sites throughout Florida. The test takes an entire Saturday to complete. Examinees usually receive their results within one month. Examinees who fail any part of the FTCE may retake that portion at a subsequent administration. The FTCE is a written test composed of four subtests. The characteristics of the four subtests are summarized in Table 1.


TABLE 1

A Description of the Four Subtests of the
Florida Teacher Certification Examination

Subtest        Competency    Type of Questions     Content                   Scoring
               Tested
------------------------------------------------------------------------------------------
Writing        2             Essay writing         General topics            Holistic scoring
                             production                                      by trained
                                                                             experts

Reading        4             Multiple choice       General education         Objective
                             "cloze" procedure     passages derived from
                                                   textbooks, journals,
                                                   state publications

Mathematics    5             Multiple choice       Basic mathematics:        Objective
                                                   simple computation
                                                   and "real world"
                                                   problems

Professional   6, 7, 9-14,   Multiple choice       General education         Objective
Education      20-24         (problem solving      (personal, social,
                             application level)    academic development,
                                                   administrative skills,
                                                   exceptional student
                                                   education)

The Writing Subtest is scored holistically (general impression marking) by three trained judges. The scoring criteria include an assessment of the following:

1. Using language appropriate to the topic and reader
2. Applying basic mechanics of writing
3. Applying appropriate sentence structure
4. Applying basic techniques of organization
5. Applying standard English usage
6. Focusing on the topic
7. Developing ideas and covering the topic

More detailed information on FTCE administrations is contained in the Florida Teacher Certification Examination Registration Bulletin. This booklet is available free from Florida school district offices and from the Department of Education.

Four bulletins have been developed to provide information about the development of the FTCE. The subtest and item specifications have been published in Bulletin I: Overview; Bulletin II: The General Education Subtests -- Reading, Writing, Mathematics; and Bulletin III: The Professional Education Subtest. Bulletin IV: The Technical Manual describes the technical adequacy of the first examination. The first three bulletins were distributed to all Florida teacher education institutions and school system personnel offices in the fall of 1979. Bulletin IV, designed primarily for measurement professionals, was published in 1981. An overview of the coverage of the FTCE is provided in Appendix C of Bulletin IV. An annual Technical Report is produced to describe the psychometric characteristics of the three tests administered during each academic year. This report covers the 1981-1982 Examinations.

Rasch Calibration of Items

Calibration of items is conducted using Rasch methodology and the BICAL computer program. The Rasch model bases the probability of a particular score on two parameters, the person's ability and the item's difficulty. The model is expressed as:

    P{X_vi | B_v, D_i} = exp[X_vi (B_v - D_i)] / [1 + exp(B_v - D_i)]

in which

    X_vi = a score (0 or 1)
    B_v  = person ability
    D_i  = item difficulty

Estimates of person ability and item difficulty are obtained using maximum likelihood estimation as described in Wright, Mead, and Bell (BICAL: Calibrating Items with the Rasch Model, 1980).
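As an illustration of the model (and not of the BICAL program itself), the dichotomous response probability can be computed directly; the ability and difficulty values below are hypothetical:

    import math

    def rasch_probability(ability: float, difficulty: float) -> float:
        """Probability of a correct response (X = 1) under the dichotomous
        Rasch model, with ability and difficulty in logits."""
        return math.exp(ability - difficulty) / (1.0 + math.exp(ability - difficulty))

    # A person whose ability equals the item's difficulty answers correctly
    # half the time; an easier item raises the probability.
    print(rasch_probability(0.5, 0.5))   # 0.5
    print(rasch_probability(0.5, -1.0))  # ~0.82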


The process of obtaining item difficulties for new items involves field-testing experimental items within regularly administered test forms. Multiple forms for each administration are comprised of sets of scored items in each form and different sets of experimental items. A subset of the scored items forms a common link between forms. The new items are calibrated to the same scale as the regular items. All items are then linked to the base scale of November 1980 by a linking constant. This linking constant is the difference between the average calibration values for the common items in November 1980 and their mean difficulty in the current administration. A description of this process can be found in Ryan (Item Banking, 1980).
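Numerically, the linking step is a simple mean shift. A minimal sketch with hypothetical common-item calibrations (the report does not publish the actual values):

    def linking_constant(base_values, current_values):
        """Mean difference between the common items' November 1980 base-scale
        calibrations and their difficulties in the current administration."""
        return (sum(base_values) / len(base_values)
                - sum(current_values) / len(current_values))

    # Hypothetical common-item difficulties (logits):
    base = [-0.40, 0.10, 0.55, 1.20]
    current = [-0.25, 0.25, 0.70, 1.35]
    shift = linking_constant(base, current)   # -0.15
    new_item_on_base_scale = 0.80 + shift     # a new item calibrated at 0.80
                                              # maps to 0.65 on the base scale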

Following each administration, the data are randomly divided into three sets of 700 candidates each. Candidates are assigned in sequential order to the appropriate data set. Calibrations are conducted on the data of the candidates in each set, and the mean difficulty values across the data sets are calculated for each item.


TEST COMPOSITION, ADMINISTRATION, AND SCORING

Test Creation and Assembly

The items contained in the Department of Education item bank are calibrated and equated to the base scale established during the April 1980 field test. The items are given identification codes, and detailed information on item usage is maintained, including the identification of the form on which each item was used, the difficulty value, item point-biserial correlation, and Rasch fit statistics for each item.

Each test form is designed to ensure that the items (a) fit the item specifications for the skill that they were designed to measure, (b) conform to the test specifications in number and type, and (c) represent a range of difficulty with a mean difficulty approximating zero logits.

A test blueprint is prepared for each form. Items are selected and subjected to content, style, and statistical reviews by the Office of Instructional Resources at the University of Florida and by the Florida Department of Education. Test items are screened for content overlap.

Placement of the items on the test is primarily a function of appearance and content. The order of the items is not related to their difficulty. Items are grouped together if they are similar in editorial style, directions, and question stems.

Experimental items are field-tested within each subtest but are not counted in a candidate's score. When multiple test forms are used, the core of regular (scored) items in each form remains the same for any administration. Test forms are spiralled so that each test center receives approximately the same number of each form. In this way, all experimental items are field-tested by at least 400 candidates who represent a cross-section of the people who take the Examination.

Once the form has been approved, the scoring key is verified. Staff members from the Department of Education, the Office of Instructional Resources, and three teachers from the public schools take the Examination. These persons are also asked to identify any ambiguous items or confusing directions.

Camera-ready copy is prepared by a test specialist and a graphic artist. Attention is paid to the proper placement of items to provide workspace where necessary. The camera-ready copy is again critiqued by the staff in the Department of Education and the Office of Instructional Resources. Corrections are made, the copy is sent to the printer, and a final check of the proof is made before the tests are printed.


Administration Procedures

Examination Dates, Times, and Locations

The FTCE is administered in the fall, winter, and summer of each year. Administration dates for 1981-82 were October 31, 1981; February 27, 1982; and July 10, 1982. Candidates were permitted to take all four subtests or any subtest previously not passed. Thirteen locations in the state were designated as testing areas. Specific sites within each area were selected as test centers. These centers were selected from the pool of established centers for the administration of standardized examinations. Designated test locations for the 1981-82 administrations were:

1. Pensacola         8. Miami
2. Tallahassee       9. Fort Myers
3. Gainesville      10. Orlando
4. Jacksonville     11. Boca Raton
5. St. Petersburg   12. DeLand
6. Tampa            13. Lakeland
7. Sarasota

All test centers were inspected to ensure that the rooms met the required specifications for lighting, seating capacity, storage facilities, air conditioning, and protection from outside disturbances. All facilities were able to accommodate handicapped candidates.

The test schedule is divided into morning and afternoon sessions. Testing time is fixed but allows adequate time for candidates to complete all sections of the Examination. Candidates may continue to the Reading Subtest after they finish the Mathematics section. The schedule for each subtest is listed below:

Writing                   45 minutes    9:00 a.m. -  9:45 a.m.
Mathematics               70 minutes   10:00 a.m. - 11:10 a.m.
Reading                   50 minutes   11:10 a.m. - 12:00 noon
Break                     60 minutes   12:00 noon -  1:00 p.m.
Professional Education   150 minutes    1:30 p.m. -  4:00 p.m.

A security plan has been developed and implemented for the program. Refer to Appendix C for further information about security and quality control.

Special arrangements are made as necessary for handicapped candidates. A Braille version of the Examination is available. A typewriter or a flexible time schedule is permitted for handicapped candidates.


Test Manuals

Uniform testing procedures were established for use at all centers throughout the state. Documentation of the procedures is available in the Test Administration Manual for the program. The administration manual includes the following topics:

1. Duties of the Test Center Personnel
2. Receipt and Security of Test Materials
3. Admission, Identification and Seating of Candidates
4. On-site Test Administration Practices and Policies
5. Timing of Subtest Sections
6. Instructions for Completing Answer Documents
7. Special Arrangements for Handicapped Students

Additional information to candidates is found in several other sources. Candidates are notified about Examination requirements, locations, and procedures in the registration bulletin. Specific directions to candidates about the assigned test center and necessary supplies are printed on the Admission Ticket.

Scoring and Reporting

Scoring

The scoring process begins with a hand edit of the answer sheets, followed by the scanning of an initial set of sheets to verify the accuracy of the scanner, the key, and the scoring programs. The remaining sheets are scanned, and the data are divided into sets of 700 candidates. Three data sets are drawn using a systematic random sampling method for the calibration of items using the Rasch methodology. The items are adjusted to the base scale established by the April 1980 field test. A score table of equivalent raw scores to ability logits is calculated and used to determine the ability logits for the remaining candidates. Each person's score in ability logits is then transformed to a scale score with 200 as the minimum passing score. For a discussion of the procedures used to establish the cutting score, see the technical discussion in Bulletin IV.
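The report does not publish the constants of the logit-to-scale-score transformation, so the sketch below only illustrates the anchoring described above (the passing cut mapped to 200), with a linear form and a slope of 10 points per logit assumed purely for illustration:

    def scale_score(ability_logit: float, cut_logit: float,
                    slope: float = 10.0) -> float:
        """Linear mapping of an ability estimate (logits) to a reported
        scale score anchored so the passing cut maps to 200. The slope and
        the cut value are illustrative assumptions, not the program's
        actual constants."""
        return 200.0 + slope * (ability_logit - cut_logit)

    # A candidate 0.3 logits above a hypothetical cut of -0.5:
    print(scale_score(-0.2, cut_logit=-0.5))  # 203.0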

The essay is rated by three readers who use a four-point scale defined in State Board of Education rules. The resulting scores range from three to twelve points. The passing standard is set at six points. Details of the criteria for the rating of essays are available in Bulletin II.
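For example, ratings of 2, 2, and 1 from the three readers total five points and fall below the standard, while ratings of 2, 2, and 2 total six points and meet it.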

Reporting

The reports generated for each administration include a candidate report and score interpretation guide, reports for institutions, and state-level reports.

Candidate reports indicate whether or not a test is passed; scaled scores are reported only for tests failed. Scores above the passing standard are not reported. However, candidates who fail one or more tests are provided their scale score for each subtest failed. A detailed analysis of performance is provided to individuals who fail the Professional Education Subtest.

The reports generated for the institutions and the state are listed below:

1. Number and Percent Passing for:

   a. Each subtest
   b. All four subtests
   c. Three, two, one or no subtests

2. Number, Percent Passing, and Mean Scores for Each Subtest and the Total Examination by All Candidates and:

   a. First-time candidates
   b. Re-take candidates
   c. Vocational candidates
   d. Non-vocational candidates
   e. Florida candidates
   f. Non-Florida candidates
   g. Florida candidates from approved degree programs
   h. Florida candidates from non-approved programs
   i. Sex and ethnic categories

3. Number and Percent of Candidates by Florida Institutions and by Programs, Passing All Subtests and Each Subtest

4. Number and Percent of Candidates Passing All Subtests and Each Subtest by Program Statewide

5. Frequency Distribution for All Candidates for Each Subtest by Sex and Ethnic Category

6. Frequency Distribution for Each Subtest for Florida Institutions

Statistical analyses of data are reported in the sections on the psychometric characteristics of the Examination.

TEST RESULTS FOR 1981-82

Results for the three test administrations2 in this report are summarized in this chapter. The overall passing rates are shown in Table 2. As can be seen from the data, there are no differences between first-time and all candidates for the first two administrations, and two percentage points higher performance for first-time takers in February 1982. The February 1982 administration was the first one that showed a large effect from retakers. This effect is not unexpected.

Table 2

Percent of Candidates Passing All Subtests of the
Florida Teacher Certification Examination
August 1981 - February 1982

                  First-Time        All
                  Candidates     Candidates
August 1981           80             80
October 1981          84             84
February 1982         86             84

Tables 3 through 5 on the following pages show the number and percent passing each subtest and all subtests for: (a) approved program candidates; (b) non-approved program candidates; and (c) vocational technical candidates.

2The August 1981 administration was a part of the data for the 1980-81 Technical Report. It is also in this report because of a decision to make the Technical Report coincide with the same test administrations that are used in calculating the eighty percent report for approved programs. The eighty percent performance report is calculated from the summer, fall, and winter administrations.

Table 3

Florida Teacher Certification Examination Number and Percent Passing
All Subtests and Each Subtest by Program Statewide for
Approved Program Candidates

August 1981; October 1981; and February 1982 Administrations

[Table 3 body not legible in this reproduction.]

Table 4

Florida Teacher Certification Examination Number and Percent Passing
All Subtests and Each Subtest by Program Statewide for
Non-Approved Program Candidates

August 1981; October 1981; and February 1982 Administrations

[Table 4 body not legible in this reproduction.]

Table 5

Florida Teacher Certification Examination Number and Percent Passing
All Subtests and Each Subtest by Program Statewide for
Vocational Technical Candidates

August 1981; October 1981; and February 1982 Administrations

                    TOTAL #    # PASSED    % PASSED
TOTAL TEST            422        179          42%
READING               369        263          71%
MATH                  379        273          72%
PROF. EDUCATION       354        265          75%
WRITING               361        247          68%

PSYCHOMETRIC CHARACTERISTICS

The psychometric characteristics of validity, reliability, item discrimination, and contrasting group performance of the Florida Teacher Certification Examination (FTCE) will be addressed in this section. Knowledge of the psychometric characteristics of assessment tests is necessary for evaluating the tests.

Validity

Validity refers to the relevance of inferences that are made from test scores or other forms of assessment. The validity of a test can be defined as the degree to which a test measures what it was intended to measure. Validity is not an all-or-none characteristic, but a matter of degree. Validity is needed to ensure the accuracy of information that is inferred from a test score.

Specific types of validation techniques traditionally used to summarize educational and psychological test use -- criterion-related validity (predictive and concurrent), content validity, and construct validity -- are described in Standards for Educational and Psychological Measurement (APA, 1974, pp. 26-31). For the FTCE, the primary validity issue that must be addressed is the question of content validity. Content validity demonstrates that test behaviors constitute a representative sample of behaviors in a desired performance domain. The intended domain of the FTCE is that of entry-level skills as identified in the statute requiring the Examination as a basis for certification. This statute (Section 231.17, Florida Statutes) provides that

    Beginning July 1, 1980 ... each applicant for initial
    certification shall demonstrate, on a comprehensive written
    examination and through such other procedures as may be
    specified by the state board, mastery of those minimum
    essential generic and specialization competencies and other
    criteria as shall be adopted into rules by the state board.

The statute addresses only the status at certification and does not require that inferences be made from test scores to future success as a classroom teacher. No claims have been made with regard to measurement of specific aptitudes or traits, and no attempt has been made to establish relationships between the FTCE and independent concurrent or future criteria. It is only claimed that the test adequately measures the skills for which it was developed. The construct and criterion-related validation approaches are not appropriate to the validity issues related to development and use of the FTCE.

The content validity of the FTCE rests upon the procedures used to describe and develop test items and content areas. The intended coverage of the test was determined by a process involving professional consensus to (1) identify competencies which should be demonstrated as a condition for certification, and (2) identify subskills associated with each competency. The procedures by which the intended coverage was identified included surveys of the profession, reviews


by the Council on Teacher Education (COTE), reviews by the ad hoc COTE task force, and reviews by teachers and other professional personnel.

The general procedures used in test development were as follows:

1. The intended test coverage was identified and explicated. Competencies and subskills associated with each competency were identified and validated.

2. Test item specifications were developed and validated.

3. Draft items were written according to test item specifications and pilot-tested on a small sample of senior students preparing to be teachers.

4. The final item review consisted of (a) a review by a special panel comprised of classroom teachers, teacher educators, and administrators, and (b) item field-testing with seniors who were in teacher education programs. This was followed by another review by Department of Education staff. Items were subsequently placed in the item bank for future use.

5. Field-test data were reviewed by Department of Education staff. Items that did not perform well were deleted from the item bank or revised and field-tested again.

For the final item review process outlined in the fourth step, the items were divided by test area and reviewers were divided by area of expertise. The process included a review of item content, group differences in performance, and technical quality. Bulletin IV (pp. 13-17) contains further information about the development and review of test items.

In summary, the validity of the Examination has been well established as a result of (1) the extensive involvement of education professionals in the identification and explication of the necessary competencies and their associated subskills, (2) the precise item specifications which guided the item writers, and (3) the reviews of the items and the competencies/skills that they were designed to measure.

Reliability of Test Scores

Reliability refers to the consistency between two measures of the same performance domain. Although reliability does not ensure validity, it limits the extent to which a test is valid for a particular purpose. The main reliability consideration for the FTCE multiple-choice tests (Reading, Mathematics, and Professional Education) is the reliability of an individual's score. For the Writing test, a production writing sample, the reliability consideration is the reliability of the judges' ratings. The data in this section refer to the three FTCE administrations between October 1981 and July 1982. For information about field test reliability data, refer to Bulletin IV (1981).


Reliability of Scores on the Multiple-Choice Tests

A test score is comprised of a "true" score ("domain" score) and an "error" score. If an individual took several forms of a test, all constructed by sampling from the defined item domain, scores on the various test forms would not vary except as a result of random errors associated with item sampling and changes within an individual from one test to another, such as attention, fatigue, or interest.

Reliability evidence is generally of two types: (a) internal consistency, which is essential if items are viewed as a sample from a relatively homogeneous universe; and (b) consistency over time, which is important for tests that are used for repeated measurement. For the FTCE, the primary reliability issue is that of internal consistency. Since one form of the test is administered to examinees at each administration, the reliability concern is that of consistency of items within that particular test (homogeneity of items). A test can be regarded as composed of as many parallel tests as the test has items, and every item is treated as parallel to the other items. In such a case, the appropriate reliability index is the Kuder-Richardson Formula 20 (KR-20) index. The KR-20 formula is shown in Appendix B.
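In standard notation the index is KR-20 = [k/(k-1)][1 - (sum of p_i q_i)/s_x^2], where k is the number of items, p_i is the proportion of examinees answering item i correctly, q_i = 1 - p_i, and s_x^2 is the variance of total scores. A minimal Python sketch of that computation follows; it illustrates the standard formula cited in Appendix B, not the contractor's actual scoring code:

    def kr20(item_scores):
        """Kuder-Richardson Formula 20 for dichotomously scored (0/1) items.
        item_scores: one list of item scores per examinee."""
        n = len(item_scores)                       # number of examinees
        k = len(item_scores[0])                    # number of items
        totals = [sum(person) for person in item_scores]
        mean = sum(totals) / n
        variance = sum((t - mean) ** 2 for t in totals) / n
        sum_pq = 0.0
        for i in range(k):
            p = sum(person[i] for person in item_scores) / n
            sum_pq += p * (1.0 - p)
        return (k / (k - 1)) * (1.0 - sum_pq / variance)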

The KR-20 index estimates the internal-consistency reliability of a test from statistical data on individual items. Separate KR-20 coefficients are calculated for the Reading, Mathematics, and Professional Education subtests for each FTCE administration. A high coefficient indicates that a test accurately measures some characteristic of persons taking it and means that the individual test items are highly correlated. The subtest KR-20 coefficients for the three 1981-1982 test administrations were above .78, indicating that the individual test items were highly consistent measures of the three subject areas assessed. Refer to Table 6 for the KR-20 coefficients.

Table 6

Kuder-Richardson Coefficients

                  Math    Reading    Professional Education
October 1981      .88       .87              .83
February 1982     .84       .85              .79
July 1982         .87       .87              .81


Reliability of the passing standards for the objective tests is estimated with the Brennan-Kane (B-K) Index of Dependability. This index is an estimate of the consistency of test scores in classifying examinees as masters or nonmasters of the minimal performance standards. The high B-K coefficients of the tests (refer to Table 7) indicate that the candidates' scores are consistent with their classification as masters or nonmasters. Refer to Appendix B for the statistical formula for the B-K Index.

TABLE 7

Brennan-Kane Indices

                  Math    Reading    Professional Education
October 1981      .94       .94              .96
February 1982     .96       .94              .96
July 1982         .95       .94              .95

Reliability of Scoring of the Writing Subtest

The major reliability consideration for the Writing test is the inter-judge reliability of ratings. The Writing test is a production writing sample that addresses one of two specific topics. The essays are rated independently by three judges, with a referee to reconcile discrepant scores. Original reliability data were obtained from a study in which essays were written by 360 teacher education students at two universities. Raters were trained by the same procedures which are being used in the actual test administrations. The reliability of the scoring process is monitored at the University of Florida for each test administration. (Refer to Appendix D for additional information about the scoring of the Writing test.)

Two approaches are used to estimate the reliability. First, four indices of inter-rater agreement are computed. These four indices are: (a) percent complete agreement; (b) average percent of two of the three raters agreeing; (c) average percent agreement by pairs as to pass/fail; and (d) percent complete agreement about pass/fail. The second approach for reliability estimation is the calculation of coefficient alpha for the raters and the rating team. This coefficient indicates the expected correlation between the ratings of the team on this task and those of a hypothetical team of similarly comprised and similarly trained raters doing the same task. Field test inter-rater reliability data and coefficient alpha for the inter-rater reliabilities are reported in Tables 3.4 and 3.5 of Bulletin IV (pp. 22-23). Refer to Table 8 for rater reliability data for the 1981-1982 FTCE administrations.


TABLE 8

Percentage of Rater Agreement for FTCE Writing Test

                                            October    February    July
                                              1981       1982      1982
Index 1 - % Complete Agreement               46.44      38.55     38.60
Index 2 - Average % Two of the Three
          Raters Agreeing                    99.87     100.00     91.81
Index 3 - Average % Agreement by
          Pairs as to Pass/Fail              98.10      96.75     96.72
Index 4 - % Complete Agreement
          About Pass/Fail                    97.15      95.12     95.08
Coefficient Alpha - Topic 1                    .84        .87       .82
Coefficient Alpha - Topic 2                    .85        .85       .86

Examination of the reliability data for the Writing test indicates that the level of reliability achieved by the rating teams met acceptable standards for such ratings.
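To make the first and fourth agreement indices concrete, the sketch below computes them for a handful of hypothetical rating triples. The per-rater pass threshold used here (a rating of 2 or higher, the per-reader counterpart of the 6-of-12 standard) is an illustrative assumption; the report defines the indices in prose without giving a computation:

    def agreement_indices(triples, pass_rating=2):
        """Index 1 (percent complete agreement) and Index 4 (percent
        complete agreement about pass/fail) for (r1, r2, r3) rating
        triples on the four-point scale."""
        n = len(triples)
        complete = 100.0 * sum(1 for r in triples if r[0] == r[1] == r[2]) / n
        # A triple "agrees about pass/fail" if all three ratings fall on
        # the same side of the assumed per-rater threshold.
        pass_fail = 100.0 * sum(
            1 for r in triples if len({x >= pass_rating for x in r}) == 1) / n
        return complete, pass_fail

    # Four hypothetical essays:
    print(agreement_indices([(2, 2, 2), (3, 2, 3), (1, 1, 2), (2, 3, 2)]))
    # (25.0, 75.0)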

Discrimination

Item analysis for the FTCE includes examination of the items' capacity to differentiate between ability groups and the evaluation of response patterns to the individual items. The item analysis indices used are item difficulty level, item discrimination index, and point-biserial correlation coefficients.

Item difficulty level -- the percentage of examinees who answer each item correctly -- is calculated for each item. These percentages provide important information because items in the moderate range of difficulty differentiate relatively more examinees from each other than do extremely easy or extremely difficult items.

Related to the item difficulty level is the item discrimination index (see page 32), which is the extent to which each item contributes to the total test in terms of discriminating between the high and low achievers with regard to the total test score. Any item that is below .20 on this index is evaluated for content and ambiguity of wording. Items that appear to be flawed are revised or eliminated. The ranges for item difficulty level and corresponding item discrimination indices are reported in Table 9.

The number and percent of examinees who select each alternative response (foil) are reported for each item in the multiple-choice tests. This foil analysis permits further evaluation of response patterns to the individual items and provides useful information about variations in response performance by different groups. These data are provided to the Department of Education staff and appropriate subcontractors and are not reported in this document.

Point-biserial correlation coefficients indicate the extent to which examinees with high test scores tend to answer an item correctly and those with low test scores tend to miss an item. While the item discrimination index is based on the performance of high and low achievers, the point-biserial coefficient includes the entire range of scores in the correlation, thereby indicating the item-total correlation, or the extent to which an item score correlates with all other items measuring a particular subject area. Statistical formulas for these indices are listed in Appendix B.
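Both indices admit short implementations. The point-biserial formula below is the standard textbook form, and the upper/lower 27% grouping in the discrimination sketch is the conventional choice assumed here for illustration; the report's own formulas are those of Appendix B, which is not reproduced in this section:

    import math

    def point_biserial(item, totals):
        """r_pb = ((M_p - M_q) / s_x) * sqrt(p * q): correlation between a
        0/1 item score and the total test score (population s.d.).
        Assumes the item has at least one correct and one incorrect response."""
        n = len(item)
        p = sum(item) / n
        q = 1.0 - p
        mean_p = sum(t for x, t in zip(item, totals) if x == 1) / (p * n)
        mean_q = sum(t for x, t in zip(item, totals) if x == 0) / (q * n)
        mean_x = sum(totals) / n
        s_x = math.sqrt(sum((t - mean_x) ** 2 for t in totals) / n)
        return (mean_p - mean_q) / s_x * math.sqrt(p * q)

    def discrimination_index(item, totals, fraction=0.27):
        """Proportion correct in the top-scoring group minus the proportion
        correct in the bottom-scoring group (conventional 27% split)."""
        order = sorted(range(len(item)), key=lambda i: totals[i])
        g = max(1, int(fraction * len(item)))
        low = sum(item[i] for i in order[:g]) / g
        high = sum(item[i] for i in order[-g:]) / g
        return high - low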


TABLE 9

Frequency of Items within Specific
Item Difficulty and Item Discrimination Ranges

[Separate panels for the Mathematics, Reading, and Professional Education subtests cross-tabulate item difficulty ranges (.0-.20 through .81-1.00) against item discrimination ranges (.10 and below through .91-1.00); the cell entries are not legible in this reproduction.]

Means and ranges for the point-biserial correlation coefficients for the three 1981-1982 administrations are reported in Tables 10 and 11.

TABLE 10

Mean Point-Biserial Correlation Coefficients

                  Math    Reading    Professional Education
October 1981      .38       .32              .27
February 1982     .37       .31              .25
July 1982         .40       .34              .25

TABLE 11

Point-Biserial Correlation Coefficients Between
Correct Item Response and Subtest Score

Range of Point-Biserial
Coefficients                Math    Reading    Professional Education
.90-.99                       0        0                0
.80-.89                       0        0                0
.70-.79                       0        0                0
.60-.69                       1        0                0
.50-.59                      18        1                0
.40-.49                      38       55               14
.30-.39                      42       93               64
.20-.29                      23       78              100
.10-.19                       8       12               50
Below .10                     0        1               12
TOTAL ITEMS FOR 1981-82     130      240              240

The appropriateness of mean point-biserial correlation coefficients must be evaluated in the context of a particular testing program. According to A Reader's Guide to Test Analysis Reports (ETS, 1981), the mean biserial correlation will be higher when the examinee group represents a wide range of ability or knowledge or when the test items are very similar in content. Since the FTCE Reading and Professional Education tests are relatively easy, the scores were not greatly different. Thus, variability was reduced, and the point-biserial correlation coefficients were attenuated.

Contrasting Group Performance

To the extent that scores on a test reflect group membership rather than the knowledge or skill that the test is designed to measure, the test is invalid. Although not all groups necessarily exhibit the same performance in different areas of achievement, the procedure for analyzing contrasting group performance is to screen for any specific areas or items. Extensive review procedures were used during FTCE development to ensure that the Examination content was an accurate representation of candidate performance in terms of the competencies being evaluated. The procedure included (a) a series of reviews during the item development stage to screen for possibly offensive materials and for items that might invalidate examinee performance and (b) statistical analysis of performance by sex groups, ethnic groups, and program groups. These procedures are described in Bulletin IV (pp. 33-38).

After each FTCE administration, test content is examined for contrasting group performance. Score distributions and summary statistics (including the mean, median, and standard deviation of the distribution and an index of skewness) are reported for each test. The content review for contrasting group performance includes (a) examination of scatterplots of performance on individual items and overall content by sex and ethnic category (male-female, black-white, white-hispanic, and hispanic-black), (b) analysis of performance by groups based on their test scores, and (c) individual item analysis by sex and ethnic category to screen for items that may discriminate negatively for a specific group.

Scatterplots

Scatter diagrams are graphic representations of the extent to which performance by two separate groups is related. Twelve scatterplots are produced for each FTCE administration, comparing performance by sex and ethnic category for each subtest. Entries that depart from the general pattern indicate that one group is performing differently from another group on specific items. In such cases, entries that depart substantially from the general pattern of other entries are reviewed for content that could account for differences in performance level. Items that are determined to be flawed during this review are revised or deleted from the item pool. An example of a scatterplot is illustrated by Figure 1.

Figure 1. Scatterplot of Percent of Examinees Who Answered Specific Items Correctly. Performance for males is plotted on the vertical axis, while performance for females is plotted on the horizontal axis. [Plot not legible in this reproduction.]

Subtest Performance by Groups


The number and percentage of candidates who pass all subtests and individual subtests are reported by sex and ethnic designation after each FTCE administration. Table 12 displays these data.

Table 12. Number and Percent of Candidates Passing All Subtests and Individual Subtests, by Sex and Ethnic Designation. [The table is printed inverted in the source and its values are not legible.]
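As a hedged sketch of the tabulation only (the record layout and field names are invented), such pass rates could be computed as follows:

    # Hypothetical sketch: number and percent passing, by group.
    # 'records' is an invented structure; the actual files differ.
    from collections import defaultdict

    def pass_rates(records, key):
        # records: iterable of dicts with 'passed' (bool) and a group field
        tally = defaultdict(lambda: [0, 0])
        for r in records:
            tally[r[key]][1] += 1
            tally[r[key]][0] += 1 if r["passed"] else 0
        return {g: (n, t, round(100 * n / t)) for g, (n, t) in tally.items()}

    records = [{"sex": "F", "passed": True}, {"sex": "M", "passed": False}]
    print(pass_rates(records, "sex"))   # {'F': (1, 1, 100), 'M': (0, 1, 0)}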

Item Analysis by Sex and Ethnic Category

Separate item analyses -- including item difficulty levels, item discrimination indices, point-biserial correlations, foil analyses (alternative response choices), and KR-20 estimates of reliability -- are reported for each sex and ethnic category. The item analysis process includes the screening of individual test items that may discriminate negatively for a specific group. When an outlying entry is identified on a scatter diagram, the item content is carefully reviewed to determine the necessity of deleting or revising the item. Foil analyses may also provide useful information with regard to contrasting group performance. Variations in response patterns by groups to different foils (alternative responses) may indicate the need for item revision.
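A minimal sketch of a foil analysis (the response coding is an invented assumption):

    # Hypothetical sketch: foil (distractor) analysis by group.
    import numpy as np

    def foil_table(choices, group, options="ABCD"):
        # choices: selected option per examinee for one item; group: labels
        table = {}
        for g in np.unique(group):
            picks = choices[group == g]
            table[g] = {o: float((picks == o).mean()) for o in options}
        return table
    # Marked between-group differences on a particular foil may
    # indicate the need for item revision.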

The procedures described in this section -- including scatter diagrams, the analysis of subtest performance by groups, and item analysis by sex and ethnic category -- are used to ensure that scores obtained on the FTCE are accurate representations of the candidates' performance levels in terms of the competencies that are addressed and are not a reflection of membership in a specific sex or ethnic category.

SUMMARY

The Florida Teacher Certification Examination (FTCE) is an examination based upon selected competencies that have been identified by Florida educators as minimal entry-level skills for prospective teachers. In order to develop the Examination, the following tasks had to be accomplished: (a) planning; (b) writing and validation of test items; (c) field-testing the examination items; (d) setting passing scores; and (e) preparing for test assembly, administration, and scoring. The competencies (described in Appendix A) have been adopted by the Board of Education as curricular requirements for teacher education programs in the colleges and universities in Florida.

The FTCE consists of three objective tests (Reading, Mathematics, and Professional Education) and an essay test (Writing) that is scored by trained readers. The general test content is as follows:

Test                     Content

Writing                  One of two general topics

Reading                  General education passages derived from
                         textbooks, journals, state publications

Mathematics              Basic mathematics: simple computation and
                         "real world" problems

Professional Education   General education, including personal, social,
                         academic development, administrative skills,
                         exceptional student education

Developmental items are included in the Examination along with regular test items. These developmental items are not counted in computing an individual's score.

The psychometric characteristics of validity, reliability, item discrimination, and contrasting group performance of the FTCE are described in this report. The validity of the examination has been well established as a result of (1) the extensive involvement of education professionals in the identification and explication of the necessary competencies and their associated subskills, (2) the precise item specifications which guided the item writers, and (3) reviews of the items and the competencies/skills that they were designed to measure. The reliability data indicate that the test items are consistent measures of the three subject areas and that the examinees' scores are consistent with their classification as masters or nonmasters of the minimal performance standards. The reliability data for the Writing test demonstrate that the scoring by the writing teams meets acceptable standards of consistency. Item analyses for the FTCE examine the power of the items to differentiate between ability groups and evaluate response patterns to individual items. The indices that are used to monitor the differences between ability groups are item percent correct, item discrimination index, and point-biserial correlation coefficients. Additional item analysis procedures -- including scatter diagrams, the analysis of subtest performance by groups, and item analysis by sex and ethnic category -- are used. These procedures ensure that scores obtained on the FTCE are accurate representations of the candidates' performance levels in terms of the competencies that are addressed and are not a reflection of membership in a specific sex or ethnic category.

The FTCE is administered three times a year in selected locations throughout the state. Data from this report indicate that the percentages of candidates who passed the entire FTCE for the October 1981, February 1982, and July 1982 administrations were 84 percent, 84 percent, and 85 percent, respectively. Examinees who do not pass all of the tests at one administration may retake the tests not passed at later scheduled testing dates.


APPENDIX A

Essential Competencies and Subskills

[The Appendix A chart of essential competencies and subskills tested is not legible in the source document.]

APPENDIX B

Mathematical Illustrations of Formulas

The following formulas were used in the calculation of statistics for the

FTCE:

(a) Point-Biserial Correlation

    r_{pb} = \left[ \frac{m_s - m_u}{\sigma} \right] \sqrt{pq}

where r_{pb} = point-biserial correlation coefficient
      m_s = mean total score of examinees answering the item right
      m_u = mean total score of examinees answering the item wrong
      \sigma = standard deviation of total scores for the entire group
      p = proportion of examinees getting the item right
      q = 1 - p
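As a worked illustration with invented values (m_s = 62, m_u = 55, \sigma = 10, p = .60):

    r_{pb} = \left[ \frac{62 - 55}{10} \right] \sqrt{(.60)(.40)} = (.70)(.49) \approx .34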

(b) Kuder-Richardson Formula 20 Reliability Coefficient

    KR_{20} = \frac{k}{k-1} \left[ 1 - \frac{\sum pq}{\sigma_x^2} \right]

where k = number of items/questions (any omitted questions not included)
      p = proportion of examinees getting the item right
      q = 1 - p
      \sigma_x^2 = variance of the total score

(c) Standard Error of Measurement

    \sigma_e = \sigma_x \sqrt{1 - r_{xx}}

where \sigma_e = standard error of measurement
      \sigma_x = the standard deviation of total scores
      r_{xx} = the reliability coefficient
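A minimal Python sketch of (b) and (c), assuming scored 0/1 responses in an examinees-by-items array (an assumption; the report's own computations are not shown):

    # Hypothetical sketch: KR-20 and standard error of measurement.
    import numpy as np

    def kr20(responses):
        # responses: (examinees x items) matrix of 0/1 scores
        k = responses.shape[1]                   # number of items
        p = responses.mean(axis=0)               # proportion right per item
        q = 1.0 - p
        var_total = responses.sum(axis=1).var()  # variance of total scores
        return (k / (k - 1)) * (1.0 - (p * q).sum() / var_total)

    def sem(responses):
        # sigma_e = sigma_x * sqrt(1 - r_xx)
        sd_total = responses.sum(axis=1).std()
        return sd_total * np.sqrt(1.0 - kr20(responses))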


(d) Item Discrimination Index

    D = \frac{R_U - R_L}{T/2}

where R_U = number of students in the high score range (i.e., the upper 27%) who answered the item correctly
      R_L = number of students in the low score range (i.e., the lower 27%) who answered the item correctly
      T = total number in the upper and lower groups

(e) Coefficient Alpha

Coefficient alpha is used as an estimate of the inter-rater reliability of Writing test scores. This coefficient indicates the expected correlation between the ratings of the team on this task and those of a hypothetical team of similarly comprised trained raters doing the same task.

    r_{kk} = \frac{k}{k-1} \left[ 1 - \frac{\sum \sigma_i^2}{\sigma^2} \right]

where r_{kk} = coefficient of reliability (alpha)
      k = number of test items
      \sum \sigma_i^2 = sum of the variances of each item
      \sigma^2 = variance of the examinees' total test score
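A hedged Python sketch of (d) and (e); the 27% grouping and the essays-by-raters layout are assumptions made for illustration:

    # Hypothetical sketch: item discrimination index and coefficient alpha.
    import numpy as np

    def discrimination(item, total):
        # D = (R_U - R_L) / (T/2), using upper/lower 27% groups
        order = np.argsort(total)
        n = int(round(0.27 * len(total)))        # size of each 27% group
        low, high = order[:n], order[-n:]
        return (item[high].sum() - item[low].sum()) / n

    def alpha(ratings):
        # ratings: (essays x raters) matrix
        k = ratings.shape[1]                     # number of raters
        item_vars = ratings.var(axis=0).sum()    # sum of rater variances
        total_var = ratings.sum(axis=1).var()    # variance of total ratings
        return (k / (k - 1)) * (1.0 - item_vars / total_var)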

(f) Brennan-Kane Reliability

    M(C) = 1 - \frac{\bar{X}_{pI}\left(1 - \bar{X}_{pI}\right) - S^2\left(\bar{X}_p\right)}{(n-1)\left[\left(\bar{X}_{pI} - C\right)^2 + S^2\left(\bar{X}_p\right)\right]}

where n = number of items
      \bar{X}_{pI} = grand mean over N persons and n items
      S^2(\bar{X}_p) = sample variance of persons' mean scores over items
      C = the criterion (cut) score, expressed as a proportion
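A hedged Python sketch of (f), assuming dichotomous item scores and a proportion-correct cut score:

    # Hypothetical sketch: Brennan-Kane dependability index M(C).
    import numpy as np

    def brennan_kane(responses, cut):
        # responses: (persons x items) 0/1 matrix; cut: proportion-correct C
        n = responses.shape[1]                 # number of items
        person_means = responses.mean(axis=1)  # each person's proportion
        grand = person_means.mean()            # grand mean over persons/items
        s2 = person_means.var(ddof=1)          # sample variance of means
        num = grand * (1.0 - grand) - s2
        den = (n - 1) * ((grand - cut) ** 2 + s2)
        return 1.0 - num / den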

APPENDIX C

Security and Quality Control Procedures


SECURITY

A security and quality control plan has been developed and implemented for the program. Components of the security plan include:

1. Controlled, limited access to all examination materials;
2. Shredding of developmental materials and used booklets;
3. Strict accounting of all materials and of persons working with test items at the testing agency and test centers.

A signed security document is obtained from every individual who has access to the examination materials. The security document contains an agreement that the individual will not reveal -- in any manner to any other individuals -- the examination items, paraphrases of the examination items, or close approximations to the examination items. Only persons who have a "need to see" the items because of their work on the project are allowed to view any parts of the examination.

During the production phases of this project, all typing and reproduction are done by persons who have security clearances. All materials are signed out when they are removed from locked storage and checked in when they are returned. One person is assigned responsibility for the secure files while all work is in process; this person is able to account for all materials at all times. Material that needs to be revised and unusable materials are not placed in wastebaskets but are kept in a locked file for special destruction.

Test Security During the Administration

The following plan has been implemented to ensure rigorous security of all materials during actual examination administration. Materials remain in secure storage at the test centers until the morning of the test date. If multiple rooms are used at a center, each room is assigned blocks of materials that must be signed for by a room supervisor, the only person who has access to the room supply. Test books and materials are never left unguarded. Candidates are assigned seats by center personnel. The seating arrangements minimize the possibility of a candidate seeing the papers of other candidates. Books are distributed by the room supervisor and proctors. Each booklet is handed to the examinees individually, and the examinees sign a receipt for the booklets by serial number. Immediately after distribution, an inventory is taken to ensure that the sum of the distributed and unused books equals the number of books assigned to the testing room. Any discrepancy is reported to the center supervisor, and immediate steps are taken to reconcile the discrepancy and locate the missing material. Every such incident is reported to the Project Manager, and appropriate action is instituted to prevent further occurrences and to recover any missing materials.

Candidates cannot leave the room during a test session except for an emergency. If a candidate must leave the room, materials are delivered to the room supervisor or proctor and held until the candidate's return. No materials


may be removed from the test room at any time. Only one candidate may leave the room at a time. Provision of a break between subtests reduces the need for candidates to leave during a test session. At the end of a session all test books are collected and accounted for before collection of the answer documents. After the answer documents are accounted for, all candidates in the room are dismissed. Upon return to their original seats, candidates are reidentified by test administration personnel before the distribution of materials for the next subtest. During breaks and the lunch period, all materials are either locked in secure storage or are placed under direct supervision of test administration personnel. All used and unused materials are returned to locked storage immediately after test administration.

Quality Control

To ensure quality control during the scoring and reporting process, the following procedures are used:

1. Each answer document is checked for proper coding and marking in response areas;

2. Computer edit programs are used to check for valid program codes on the registration forms and for matching names and social security numbers on the registration and scoring files (a sketch of such an edit check follows this list);

3. Test data are used to verify the accuracy of all scoring and reporting programs;

4. Sample data are drawn prior to scoring from each administration to screen for key, printing, or procedural errors;

5. Random answer documents are hand-scored during the scanning process to verify proper operation of the scanner;

6. A complete review of all procedures -- which includes hand-checking a sample of test data -- is completed by members of the University of Florida, Office of Instructional Resources and the Department of Education before printing the candidate score reports;

7. Analyses of the holistic scoring process are conducted. This review addresses the overall reliability of the ratings, the distribution of scores, and the number of refereed scores for each reader. Specific procedures for quality control during the holistic scoring process are documented in the Procedural Manual for Holistic Scoring;

8. The accuracy of the calculations for the institutional and state reports is hand-verified.
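The following Python fragment is only a hypothetical illustration of the kind of edit check named in item 2; the field names ('name', 'ssn') are invented:

    # Hypothetical sketch of an edit check: match registration and
    # scoring records by name and an identifying number.
    def unmatched_records(registration, scoring):
        # each argument: list of dicts with 'name' and 'ssn' keys
        reg_keys = {(r["name"].strip().upper(), r["ssn"])
                    for r in registration}
        return [s for s in scoring
                if (s["name"].strip().upper(), s["ssn"]) not in reg_keys]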

APPENDIX D

Scoring the Writing Examination


HOLISTIC SCORING OF THE WRITING SUBTEST OF THE FLORIDA TEACHER CERTIFICATION EXAMINATION

The Writing Subtest

The Writing Subtest was designed to assess a candidate's ability to write in a logical, easily understood style with appropriate grammar and sentence structure. The subskills to be measured are:

a. Uses language at the level appropriate to the topic and reader;
b. Comprehends and applies basic mechanics of writing: spelling, capitalization, and punctuation;
c. Comprehends and applies appropriate sentence structure;
d. Comprehends and applies basic techniques for the organization of written material;
e. Comprehends and applies standard English usage in written communication.

The candidate is given a choice between two topics on which to write an

essay during the 45-minute examination period. This essay should demonstrate

the competency and subskills specified above. The essay or writing sample is

scored holistically by at least three trained and experienced judges.

The Process of Holistic Scoring

Holistic Scoring Defined

Holistic scoring or evaluation is a process for judging the quality of

writing samples. It has been used for many years by professional testing agencies for credit-by-examination, state assessment, and teacher certification

programs.

Essays are scored holistically, that is, for the total, overall impression they make on the reader, rather than for an analysis of specific features of a piece of writing. Holistic scoring assumes that the skills which make up the ability to write are closely interrelated and that one skill cannot be separated from the others. Thus, the writing is viewed as a total work in which the whole is something more than the sum of the parts. A reader reads a writing sample quickly, once. He or she obtains an impression of its overall quality and then assigns a numerical rating to the paper based on judgments of how well it meets a particular set of established standards.

The Reader

The key to the effectiveness of the holistic scoring process is the readers, who must make valid and reliable judgments. Readers must bring to the process experience in teaching and grading English compositions. In addition, they must be willing to undergo training in holistic scoring, which demands that they set aside personal standards for judging the quality of a writing sample and adhere to standards which have been set for the examination. The goal for the reading of the Writing Subtest of the Florida Teacher Certification Examination is to rate a large number of essays according to their overall competence in a consistent or reliable manner according to previously established standards based on a set of defined criteria. By undergoing a set of training procedures, a group of experienced teachers of composition can develop a high level of consistency in making judgments about the quality of a group of essays.

The Criteria

The criteria established to score the essays for the Florida Teacher Certification Examination are listed below. They were developed to accommodate specific conditions imposed by the Writing Subtest:

(1) They reflect those characteristics widely accepted as indicative of good writing;
(2) They can be translated into operational descriptions of levels of competence;
(3) They reflect the general competency statement and subskills identified by the Council on Teacher Education.

Specific Criteria for Evaluation of Essays

1. Rhetorical Quality

1.1 Unity: An ordering and interdependence of parts producing a single effect; completeness.
1.2 Focus: Concentration on a topic; the presence of a "center of gravity."
1.3 Clarity: Lucidity of expression; lack of ambiguity and distortion.
1.4 Sufficiency: Appropriate depth and breadth of expression to meet the writer's purposes and the demands of the particular topic.

2. Structural and Mechanical Quality

2.1 Organization: Consistent and coherent integration and connection of parts.
2.2 Development: Appropriate and sufficient exposition of ideas; use of detail, examples, illustration, comparisons, etc.
2.3 Paragraph and Sentence Structure: Appropriate form, variety, logic, relatedness of and among structural units.
2.4 Syntax: Appropriate ordering of words to convey intended meaning.


3. Observance of Conventions in Writing

3.1 Usage: Appropriate use of language features: inflections, tense, agreement, pronouns, modifiers, vocabulary, level of discourse, etc.
3.2 Spelling, Capitalization, Punctuation: Consistent practice of accepted forms.

The relationship between the subskills and the scoring criteria is illustrated in the figure below.

ESSENTIAL COMPETENCIES: Demonstrate the ability to write in a logical, easily understood style with appropriate grammar and sentence structure.

a. Use language appropriate to the topic and reader.
b. Apply basic mechanics of writing.
c. Apply appropriate sentence structure.
d. Apply basic techniques for organization.
e. Apply standard English usage.

[Figure: a matrix relating subskills a-e to the operational descriptions of the scoring criteria, grouped under the headings Rhetorical, Structural, and Conventional; the individual cell entries are not legible in the source.]

The operational descriptions based on the scoring criteria reflect the four levels of competency which the readers are to assign to each of the essays they read. Each reader will independently score or rate a paper on a scale of 1 to 4, with 4 being the highest rating. The descriptions which follow are an attempt to express clearly and precisely the general, overall impressions a reader has in terms of the criteria when he or she reads essays of varying quality. The number of levels of competence could be expanded or decreased; however, for the task of scoring the Writing Subtest, a four-level scale provides enough degrees of distinction to be meaningful yet manageable for large-scale testing.

4. The essay is unified, sharply focussed, and distinctively effective. It treats the topic clearly, completely, and in suitable depth and breadth. It is clearly and fully organized, and it develops ideas with consistent appropriateness and thoroughness. The essay reveals an unquestionably firm command of paragraph and sentence structure. Syntactically, it is smooth and often elegant. Usage is uniformly sensible, accurate, and sure. There are very few, if any, errors in spelling, capitalization, and punctuation.

3. The essay is focussed and unified, and it is clearly if not distinctively written. It gives the topic an adequate though not always thorough treatment. The essay is well organized, and much of the time it develops ideas appropriately and sufficiently. It shows a good grasp of paragraph and sentence structure, and its usage is generally accurate and sensible. Syntactically, it is clear and reliable. There may be a few errors in spelling, capitalization, and punctuation, but they are not serious.

2. The essay has some degree of unity and focus, but each could be improved. It is reasonably clear, though not invariably so, and it treats the topic with a marginal degree of sufficiency. The essay reflects some concern for organization and for some development of ideas, but neither is necessarily consistent nor fully realized. The essay reveals some sense, if not full command, of paragraph and sentence structure. It is syntactically bland and, at times, awkward. Usage is generally accurate, if not consistently so. There are some errors in spelling, capitalization, and punctuation that detract from the essay's effect if not from its sense.

1. The essay lacks unity and focus. It is distorted and/or ambiguous, and it fails to treat the topic in sufficient depth and breadth. There is little or no discernible organization and only scant development of ideas, if any at all. The essay betrays only sporadically a sense of paragraph and sentence structure, and it is syntactically slipshod. Usage is irregular and often questionable or wrong. There are serious errors in spelling, capitalization, and punctuation.

Training of Readers

The training of readers for the Writing Subtest of the Florida Teacher Certification Examination consists of three steps:

Step 1: Acquiring information about the examination and the holistic scoring process.

Step 2: Reading and scoring essays which have been selected as good examples of the various levels of competence in writing. The practice essays have been scored by experienced readers and annotated in accordance with the operational descriptions. By reading, scoring, and discussing the essays, the readers practice until they consistently give the same ratings to essays as the experienced readers.

Step 3: Reading and scoring a sample of the actual Writing Subtests which have been selected and scored prior to the training session. These samples will serve as the standards for the scoring of the examination and will include essays which represent each of the competency levels. As in Step 2, the emphasis will be on each reader assigning scores which agree with those established earlier by the experienced readers. This step occurs immediately before the actual scoring session and often is repeated during the session to ensure continued consistency or reliability of assigned scores or ratings.

Setting the Standards

Prior to Step 3 in the training, standards for the Writing Subtest are established. The Chief Reader, who is responsible for conducting the holistic scoring, and his assistants, the Assistant Chief Reader and the Table Leaders, select, at random, a sample of papers from the total group of essays written on a particular topic. These papers are read and scored independently by each person. Results are compared, and consensus is reached for the identification of four papers. Each becomes a standard for one of the four competency levels. Additional papers are chosen to be used in Step 3 of the training procedures. This process is repeated for the second topic of the Writing Subtest.

The Scoring Session

The scoring session begins immediately after Step 3. Readers are assigned to tables in groups of four or five. The number of readers and the number of tables are determined by the number of essays to be scored. Each table of readers is also assigned a Table Leader. The Table Leader's primary task is to continually monitor the scoring process and consult with readers as questions or "problem" papers arise. The Table Leader is an experienced reader who has helped set the standards.

Each reader is given a set of papers to read, rate, and mark the score. The identity of the writer is not known to the reader. The papers range, on the average, from 200 to 400 words in length, and each can be read and scored holistically in approximately two minutes. As the scoring of a set of papers is completed by a reader, a clerk collects and returns the papers to the operations table. The scores given by the readers are covered, and the papers are redistributed to another set of folders and delivered by the clerk to a second table of readers. This procedure continues until each paper has been read by three different readers. Each reader reads, judges, and scores at his own pace. Scoring sessions are approximately three hours long, with ten-minute breaks each hour. Usually there are two scoring sessions for each day of holistic reading.

After a paper has been scored by three different readers, the scores are examined at the operations table. If one of the scores varies from any other by two levels or more (e.g., 3-3-1), the paper is sent to the Chief Reader or Assistant Chief Reader, who serves as referee. This person assigns a rating which replaces the discrepant score. Papers whose original ratings are 1-2-3 or 2-3-4 are refereed and scored as follows:

Rating of 1-2-3

(a) A referee rating of 1 will replace the 3, resulting in a score of 4
(b) A referee rating of 2 will replace the 1, resulting in a score of 7
(c) A referee rating of 3 will replace the 1, resulting in a score of 8

Rating of 2-3-4

(a) A referee rating of 2 will replace the 4, resulting in a score of 7
(b) A referee rating of 4 will replace the 2, resulting in a score of 11
(c) A referee rating of 3 will replace the 2, resulting in a score of 10

All initial scores of 5 will be refereed. If any paper is refereed and a discrepancy still occurs, the essay is submitted to a new team of readers until consistency is obtained.

The three scores are then added together for a total score. Thus the

lowest score possible is a 3, the highest, 12.
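The following Python sketch is a hedged rendering of the refereeing arithmetic described above; the rule for choosing which score the referee rating replaces is inferred from the worked examples:

    # Hypothetical sketch: combine three holistic ratings, refereeing
    # papers whose ratings differ by two or more levels.
    def total_score(ratings, referee=None):
        # ratings: list of three 1-4 ratings
        r = sorted(ratings)
        if r[2] - r[0] >= 2:                 # e.g. 3-3-1 or 1-2-3
            if referee is None:
                raise ValueError("referee rating required")
            # Replace whichever extreme is farther from the referee rating
            # (inferred rule; matches the 1-2-3 and 2-3-4 examples above).
            drop = r[0] if abs(r[0] - referee) >= abs(r[2] - referee) else r[2]
            r.remove(drop)
            r.append(referee)
        return sum(r)                        # totals range from 3 to 12

    print(total_score([1, 2, 3], referee=3))  # referee 3 replaces the 1 -> 8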

Final Steps

After the reading sessions are completed, Table Leaders evaluate the performance of Readers. The Chief Reader evaluates the Table Leaders. Readers are asked for comments and suggestions for improving training and scoring procedures.

Two approaches for reliability estimation are the percentage of rater agreement and the calculation of coefficient alpha for the raters and the rating team, which indicates the expected correlation between the ratings of the team and those of a hypothetical team of similarly comprised and similarly trained raters doing the same task. The four indices that represent rater agreement are: (a) percent complete agreement; (b) average percent of two of the three raters agreeing; (c) average percent agreement by pairs as to pass/fail; and (d) percent complete agreement about pass/fail. These data are reported in Table 8 of this report.
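As a hedged illustration in Python (the per-rating passing cut is an assumed parameter, since pass/fail on the subtest is actually determined from the total score), the four agreement indices could be computed as follows:

    # Hypothetical sketch: rater-agreement indices (a)-(d) for papers
    # rated by three readers; 'passing' is an assumed per-rating cut.
    from itertools import combinations

    def agreement_indices(papers, passing=2):
        # papers: list of (r1, r2, r3) rating triples
        n = len(papers)
        complete = sum(len(set(p)) == 1 for p in papers) / n          # (a)
        two_of_three = sum(len(set(p)) <= 2 for p in papers) / n      # (b)
        pair_pf = sum(
            sum((a >= passing) == (b >= passing)
                for a, b in combinations(p, 2)) / 3
            for p in papers) / n                                      # (c)
        pf_complete = sum(len({r >= passing for r in p}) == 1
                          for p in papers) / n                        # (d)
        return complete, two_of_three, pair_pf, pf_complete

    print(agreement_indices([(3, 3, 3), (2, 3, 3), (1, 2, 4)]))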