
Analysis of the Psychometric Properties of the Student Perception of Teaching Instrument (SPOT)

Office of Institutional Effectiveness and Analysis
March 2004


An Analysis of the Psychometric Properties of the Student Perception of Teaching Instrument (SPOT)

Forms completed between Fall 1999 and Summer 2003 (N = 423,543)

INTRODUCTION

Good measurement instruments are regularly evaluated for validity, reliability and bias. The present version of the Student Perception of Teaching, in use since fall 1999, resulted from an analysis demonstrating that the previous form was unidimensional, with ambiguous wording and vague descriptions of behavioral attributes. Although it was intended that the new instrument, called SPOT, would be evaluated after one year of use, almost five have elapsed since its inception. The delay is advantageous in that it provides a larger database for analysis: almost 424,000 completed surveys. Following is a description of the analyses undertaken to evaluate several psychometric properties of the SPOT. The instrument, along with descriptive statistics on the results from four years of administration, is attached as Tables 1 – 3. A review of these data will help the reader better understand the subsequent analyses.

VALIDITY

Dimensionality

How many components of effective teaching are being measured? Are all 29 items necessary to measure the constructs? Are any of the items superfluous because they do not add information beyond what is provided by other items?

Analysis [Tables 4 and 5]

Since good teaching is multidimensional and no single set of items can measure all aspects of teaching, it is expected that factor analytic studies of teaching evaluation instruments will uncover distinct components of effective teaching. Factors commonly found include course organization, caring about students, communication skills, course difficulty, grading, and self-rated learning. Research into student ratings has defined as many as 22 distinct logical dimensions, although the norm in a single instrument is closer to six.

Scale items on the SPOT instrument had high degrees of correlation. Among items 1-23, nearly all items had correlations of .7 or above, and most were in the .8 to .9 range. Availability during office hours and appointment times (item 16) and desire to learn more about a subject (item 22) had lower correlations with the other scale items, typically in the .5 to .6 range. Item 15 had a smaller, negative correlation with the other scale items, with coefficients in the -.2 to -.3 range.
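For illustration, an inter-item correlation analysis of this kind can be sketched in a few lines of Python. The file name spot_responses.csv and the columns item_1 through item_29 are hypothetical stand-ins for the response-level data; this is a minimal sketch, not the code used to produce Table 4.

```python
import pandas as pd

# Hypothetical input: one row per completed SPOT form,
# numeric columns item_1 ... item_29 holding the item ratings.
responses = pd.read_csv("spot_responses.csv")
items = [f"item_{i}" for i in range(1, 30)]

# Pairwise Pearson correlations among the 29 rating items (cf. Table 4).
corr = responses[items].corr()
print(corr.round(2))

# Flag distinct item pairs correlated at .80 or above as candidate redundancies.
for i, a in enumerate(items):
    for b in items[i + 1:]:
        if abs(corr.loc[a, b]) >= 0.80:
            print(f"{a} ~ {b}: r = {corr.loc[a, b]:.2f}")
```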


Principal components analysis found three dimensions measured by the 29 items on the SPOT instrument. Twenty-three of the 29 items on the SPOT form contribute to the first component extracted, representing a “general instructional effectiveness” component. The second component extracted consists of four items, representing a “difficulty/pace/effort” dimension of teaching. The two remaining items involve expected grade and a student’s desire to learn more about the course subject, representing a course-related “subject interest/aptitude” component.

Redundant items: Correlation coefficients for items 1-8 (which were required by the Board of Regents and are currently made public on the web) show high redundancy with counterparts in the second set, items 9-29. The following items, in particular, are duplicated:

- Item 1 (Description of course objectives and assignments) with item 9 (Description of course objectives);
- Item 3 (Expression of expectations for performance in this class) with item 19 (Told students how they would be evaluated);
- Item 4 (Availability to assist students in or out of class) with item 16 (Was available during office hours or appointment times);
- Item 5 (Respect and concern for students) with items 13 (Seemed to know when students did not understand the material), 14 (Seemed concerned with whether students learned), and 18 (Was willing to listen to students’ questions and opinions);
- Item 7 (Facilitation of learning) with items 13 (Seemed to know when students did not understand the material), 14 (Seemed concerned with whether students learned), and 20 (Gave assignments that assisted in learning the material);
- Item 8 (Overall rating of instructor) with items 23 (I would like to take another course with this instructor) and 29 (Rating compared to other instructors).
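A principal components analysis of the kind described above could be sketched as follows. The data file and column names are again hypothetical, and Table 5 reports a rotated solution, whereas this sketch stops at the unrotated three-component extraction for brevity.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical input: one row per completed SPOT form, columns item_1 ... item_29.
responses = pd.read_csv("spot_responses.csv").dropna()
items = [f"item_{i}" for i in range(1, 30)]

# Standardize the items, then extract three principal components.
z = StandardScaler().fit_transform(responses[items])
pca = PCA(n_components=3).fit(z)

# Component loadings (eigenvector times the square root of the eigenvalue);
# the report's Table 5 additionally rotates this solution.
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
print(pd.DataFrame(loadings.round(3), index=items,
                   columns=["comp_1", "comp_2", "comp_3"]))
print("Percent of variance explained:",
      (pca.explained_variance_ratio_ * 100).round(1))
```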

RELIABILITY

Consistency

Within a given class, do the students tend to give similar ratings on a given item?

Analysis [Tables 6a – 6c]

Reliable instruments are those which measure something consistently. Reliability in this context is most appropriately assessed by comparing the variation in responses within classes to the variation across classes. It is assumed that different students experiencing the same instruction should agree more on a given rating item than all students rating that item across all classes (of the same size). The larger the intraclass correlation coefficient, the more differentiation there is among classes relative to that among raters within classes. Coefficients of .70 or larger indicate adequate reliability; lower correlations indicate more variability within classes than we are willing to accept. Using this standard, classes of ten or more produce reliable ratings on all items except several where individual student characteristics might be expected to influence ratings, e.g., “Expected too much work for the number of credits” or “How much effort have you put into this course?” The “Overall rating” item showed good reliability at all levels, even at a class size of 5.
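One plausible way to estimate such an intraclass correlation, here the one-way ICC for the reliability of a class-average rating (often written ICC(1,k)), is sketched below. The file spot_responses.csv and the class_id column are hypothetical, and the restriction to classes of exactly ten respondents mirrors only one of the class-size columns in Tables 6a – 6c.

```python
import pandas as pd

def icc_1k(df: pd.DataFrame, item: str, class_col: str = "class_id") -> float:
    """One-way intraclass correlation for class-average ratings:
    (MS_between - MS_within) / MS_between, from a one-way ANOVA."""
    groups = df.groupby(class_col)[item]
    grand_mean = df[item].mean()
    n_classes = groups.ngroups
    # Between-class and within-class sums of squares.
    ss_between = (groups.size() * (groups.mean() - grand_mean) ** 2).sum()
    ss_within = ((df[item] - groups.transform("mean")) ** 2).sum()
    ms_between = ss_between / (n_classes - 1)
    ms_within = ss_within / (len(df) - n_classes)
    return (ms_between - ms_within) / ms_between

responses = pd.read_csv("spot_responses.csv")  # hypothetical response-level file
# Keep only classes with exactly 10 respondents (one column of Tables 6a - 6c).
ten = responses.groupby("class_id").filter(lambda g: len(g) == 10)
for i in range(1, 30):
    print(f"item_{i}: ICC = {icc_1k(ten, f'item_{i}'):.2f}")
```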

Generalizability

To what extent do the evaluations in a particular course or courses represent the instructor’s general teaching effectiveness? What is the primary determinant of the SPOT results: the instructor or the course?

Analysis [Table 7]

When results of student ratings are used in personnel decisions, it is important to know whether the evaluations in a particular course or courses represent the instructor’s general teaching effectiveness. “Generalizability” analysis correlates the average ratings from pairs of classes under each of three conditions. The first condition, correlating the ratings from two sections of the same course with the same instructor in two successive terms, provides a measure of interrater reliability for course-instructor combinations; these correlations are expected to be highest, since only the term differs. The second condition, correlating two sections of different courses with the same instructor in the same term, isolates the instructor effect. The third condition, correlating two sections of the same course with different instructors in the same term, isolates the course effect.

Correlations were higher when the instructor was the same, even when the course was different. Course-related correlations averaged only .20; the two course-related items (22 and 24) showed moderate rater agreement across instructors, but not as much as with the same instructor. Thus, the primary determinant of the SPOT results is the instructor, not the course. The instructor also influences how students perceive the interest and difficulty level of the subject matter.
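A sketch of one of these conditions (same instructor, different course, same term) follows, using Spearman’s rho as in Table 7. The file spot_class_means.csv and its columns (instructor_id, course_id, term, class_id, item_8) are hypothetical, and the matching of paired classes on size used for Table 7 is omitted here.

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical input: one row per class, with identifiers and the class-mean
# rating for each SPOT item (only item_8, the overall rating, is used below).
class_means = pd.read_csv("spot_class_means.csv")

# Pair classes taught by the same instructor in the same term but in different courses.
pairs = class_means.merge(class_means, on=["instructor_id", "term"], suffixes=("_a", "_b"))
pairs = pairs[(pairs["course_id_a"] != pairs["course_id_b"]) &
              (pairs["class_id_a"] < pairs["class_id_b"])]  # keep each pair once

rho, _ = spearmanr(pairs["item_8_a"], pairs["item_8_b"])
print(f"Overall rating, same instructor / different course / same term: rho = {rho:.2f}")
```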

Threats to reliability

Are the students careful in completing the forms?

Analysis

For items 1-8, 43% of all respondents rated every item identically, indicating that many students did not take time to read and consider their response to each individual item. For items 9-21, almost 20% of all respondents rated every item identically. Item 15, “Expected too much work for the number of credits,” was deliberately inserted into this item group in order to detect “response set.” It was anticipated, for example, that students who rated all of the other items in the set favorably would not be likely to completely agree that the instructor expected too much work. Responses to item 15 were also checked against item 27, “How much effort have you put into this course?”, on the assumption that students who completely agreed with item 15 would indicate that they put more effort into the course. Although the correlation between the two items was positive, the association was low (r = .25).

Lower level students had the smallest percentage of respondents rating these items identically, while upper level students had the highest percentage; graduate students fell in the middle. Students in courses in Fine Arts, Nursing and Education had the highest percentage of respondents rating all items identically, while students in Architecture and Engineering courses had the smallest percentage.
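The response-set checks described here reduce to simple counts and one correlation. A hypothetical sketch, using the same assumed file and column names as above:

```python
import pandas as pd

responses = pd.read_csv("spot_responses.csv")  # hypothetical file, columns item_1 ... item_29

# Percent of forms giving the identical rating to every item in a block.
block_1_8 = [f"item_{i}" for i in range(1, 9)]
block_9_21 = [f"item_{i}" for i in range(9, 22)]
same_1_8 = responses[block_1_8].nunique(axis=1).eq(1).mean() * 100
same_9_21 = responses[block_9_21].nunique(axis=1).eq(1).mean() * 100
print(f"identical ratings on items 1-8: {same_1_8:.0f}% of forms")
print(f"identical ratings on items 9-21: {same_9_21:.0f}% of forms")

# The reverse-keyed check item (15) against self-reported effort (27).
print("r(item_15, item_27) =",
      round(responses["item_15"].corr(responses["item_27"]), 2))
```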

SOURCES OF BIAS

Are there factors apart from teaching performance that influence the ratings that students give? These variables might include class size, class level, discipline, workload/difficulty of the course, expected grade, instructional delivery method, or whether the class was taken in a summer or regular term.

Analysis [Table 8]

In order to isolate factors influencing student ratings, we conducted a series of multiple regression analyses, predicting the overall rating of the instructor (item 8), the effectiveness of this instructor compared to others (item 29), and the willingness to take another course with this instructor (item 23) from variables suspected to influence ratings. Variables entered into the model included the items on the SPOT not identified as dependent variables; section type (lecture, lab or other); course level (lower, upper or graduate); discipline (disciplines were defined using CIP codes and may cross several colleges); whether the course was taught in the summer vs. a long term; the class response rate; and class size. A difficulty/pace/effort variable was computed by averaging items 24, 25 and 27. Preliminary data analysis indicated that the relationship between this variable and the dependent variables might be curvilinear, so a quadratic form of the variable was included. Cross products between disciplines and the difficulty/pace/effort variable, in both linear and quadratic forms, were computed and entered into the model to test for interaction effects.

The regression models accounted for 86% to 93% of the variance in the dependent variables. The best predictors were the items related to student learning (6, 7, 17, 20, 22, 28). Once student learning was taken into account, items pertaining to “caring” and “organization” contributed most of the remaining effects on the dependent variables. The difficulty/pace/effort items had a curvilinear relationship with the dependent variables; instructor ratings were highest when the course difficulty/pace/effort was “just right”, and declined as the difficulty index moved in either direction, toward very difficult/fast or very easy/slow.
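A pared-down version of such a regression model can be sketched with statsmodels. The column names section_type, course_level and summer_term are hypothetical, and only a subset of the predictors described above is included; the full models also contained the remaining SPOT items, discipline indicators, the discipline-by-difficulty interactions, response rate and class size.

```python
import pandas as pd
import statsmodels.formula.api as smf

responses = pd.read_csv("spot_responses.csv")  # hypothetical response-level file

# Difficulty/pace/effort index (mean of items 24, 25 and 27) and its square,
# allowing the curvilinear relationship described above.
responses["dpe"] = responses[["item_24", "item_25", "item_27"]].mean(axis=1)
responses["dpe_sq"] = responses["dpe"] ** 2

# Overall rating of instructor (item 8) predicted from the learning-related items,
# the difficulty index and its square, and categorical course characteristics.
model = smf.ols(
    "item_8 ~ item_6 + item_7 + item_17 + item_20 + item_22 + item_28"
    " + dpe + dpe_sq + C(section_type) + C(course_level) + C(summer_term)",
    data=responses,
).fit()
print("R-squared:", round(model.rsquared, 3))
print(model.params.round(3))
```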

Section type had a small influence on the overall rating of the instructor and the desire to take another course with that instructor, with lab courses rated slightly higher than lectures. Students in lower division courses rated their instructors slightly higher and more effective than those in upper division courses. Among the disciplines, students taking courses in architecture, engineering and the physical sciences were more likely to want to take another course with the instructor, and students in education less likely; results for rating the effectiveness of the instructor were similar across disciplines. Smaller classes tended to rate the instructor as slightly less effective, after all other effects were entered into the model.

OTHER ANALYSIS

How do students’ expected grades compare with grades actually given in the course? Do differences between expectation and reality differ by discipline? By level?

Analysis [Table 9]

On average, raters expected to receive higher grades than they actually did. The difference was greatest at the lower level (.42) and smallest at the graduate level (.07). Overall, the reality gap was greatest for courses in the natural sciences, humanities and social sciences, and lowest for education and nursing.
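The expected-versus-actual grade comparison amounts to a grouped mean difference once ratings are matched to final grades. A minimal sketch, assuming a hypothetical merged file spot_with_grades.csv with both grades coded on the same four-point scale:

```python
import pandas as pd

# Hypothetical merge of SPOT forms with registrar grades, both on a 4-point scale.
# Assumed columns: course_level, discipline, expected_grade, actual_grade.
merged = pd.read_csv("spot_with_grades.csv")

merged["gap"] = merged["expected_grade"] - merged["actual_grade"]
print(merged.groupby("course_level")["gap"].mean().round(2))
print(merged.groupby("discipline")["gap"].mean().round(2).sort_values(ascending=False))
```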

UNEXAMINED PROPERTIES

“Validity” is the gold standard for measurement. If an instrument does not accurately measure what it purports to measure, it does not matter that it is reliable and unbiased. These are some aspects of validity that have not yet been examined for the SPOT:

Construct validity

Is there a relationship between the content of the SPOT and the constructs intended to be measured? Do the items represent characteristics of effective instruction? The items on the SPOT should reflect a view of instruction that is supported by the institution as a whole.

Concurrent validity

Do the SPOT results agree with other evidence of teaching effectiveness, such as alumni ratings, peer ratings, self-ratings, chair or dean ratings, course outlines, or teaching portfolios? Theoretically, the best criterion of effective teaching is student learning.

Reliability can also be compromised by item wording. A critical analysis of the adequacy of item wording was not done as part of this study. Are the SPOT items clearly written and unambiguous? Do they describe instructor behaviors specifically enough for students to use their full powers of discrimination?


SUMMARY AND CONCLUSIONS

We were disappointed to find that the unique dimensions of effective instruction that the new SPOT form was intended to measure are not evident. Nearly all of the items on the SPOT form contribute to a “general” measure of instructional effectiveness; four items contribute to a dimension of course difficulty/pace and required effort, while the remaining two items point to a measure of subject interest/aptitude. Thus, despite the use of a redesigned form, student ratings of instruction remain largely unidimensional. Because the wording of our items closely parallels that of proven multidimensional instruments like the IDEA, SIR II and SEEQ, we suspect that it may be the behavior of the raters using the form, rather than its content, that contributes to this result. The extent of “response set”, or students giving every item the same rating down the page, illustrates the problem. Possible explanations include students not being given adequate time to consider their ratings, or students not seeing the purpose and benefit of providing thoughtful evaluations of instruction. Possibly, administering the same form to every class in every term has a mind-numbing effect. Part of the mind-numbing may come from the duplication of item content; items 1 – 8 are holdovers from the former Board of Regents required public items, and all of them have counterparts in the second set of questions, rendering them superfluous.

The reliability of student rating instruments is not disputed. The SPOT instrument continues to exhibit strong reliability, especially in classes of ten or more. Ratings from smaller classes should be interpreted more cautiously. Analysis confirms that it is the instructor, and not the course, that is being evaluated. As a general rule, the lower the generalizability correlation for a particular item, the more courses should be used when making judgments that affect a personnel decision.

Faculty often have concerns about factors other than teaching effectiveness that might influence the ratings that students give. Our analysis confirms a large body of research that has repeatedly shown that many factors assumed to be potential biases do not significantly affect the overall ratings of instructors. Most of the variation in the overall rating of the instructor or instructor effectiveness is attributable to the items comprising the “general effectiveness” component from the factor analysis. Although we divided these items into learning, caring, organization and communication groupings based on the concepts they address, these groupings did not differ statistically. If students exhibit any bias toward instructors for difficulty or workload, it is to penalize those instructors whose courses are at either extreme of difficulty, pace or effort. We see little evidence that ratings need to be interpreted in the context of course level, section type, class size or discipline, since these factors contributed very little to explaining the variance in overall ratings.


Finally, although there was a low to moderate positive correlation between the instructor’s overall rating and the expected grade, this is attributable to the lack of variability in both items: 77% of students rate their instructors as Very Good or Excellent, and 86% expect to earn an A or B in the course (although they won’t!). The expected grade is not a significant predictor of ratings.

The small number of significant predictors for item 8 underscores its probable “halo” effect as an outcome measure. The “overall” wording allows the rater to fall back on his or her general liking of the instructor, or whatever criteria might come to mind at the moment. Items 23 and 29 show better separation of predictors because they ask more specific questions. “Global” rating items are discouraged by the research literature on student evaluations. If a global question is needed, it would be preferable to word it in terms of some explicitly stated outcome, such as the item used by the SIR II, which asks students to rate the quality of instruction as it contributed to their learning in the course (very effective to ineffective).

We suggest that future efforts to improve the usefulness of the evaluation of instruction include consideration of the following:

- Review the current form for construct validity. Revise it to address the inadequacies identified by this study (e.g., disposing of items 1-8, constructing more specific “global” outcome items, and addressing the use of technologies and other dimensions of teaching not now captured).

- Teach students to become more sophisticated evaluators; make the experience of completing the form an educational one.

- Promote the importance of the SPOT. Tell students what it is used for and how faculty members can use it to improve their courses.

- Consider alternate forms of assessment (e.g., open-ended questions) so that students do not fill out the same form in every class, every term.

- Investigate other methods of evaluating faculty performance, such as peer review of teaching portfolios.

- Encourage and train faculty to use classroom assessment techniques to get more frequent feedback from students on their effectiveness.

- Support research into how students arrive at their conclusions about an instructor’s effectiveness, and into how faculty use information from the rating forms.


Table 4 Correlation of Item Ratings, Student Perception of Teaching

Rows and columns are SPOT rating item numbers (lower triangle shown).

Item  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
1   1.00
2   0.92 1.00
3   0.93 0.93 1.00
4   0.83 0.85 0.86 1.00
5   0.82 0.85 0.86 0.89 1.00
6   0.86 0.92 0.89 0.85 0.87 1.00
7   0.90 0.94 0.92 0.87 0.88 0.94 1.00
8   0.91 0.95 0.93 0.89 0.91 0.94 0.96 1.00
9   0.90 0.86 0.88 0.77 0.77 0.80 0.84 0.85 1.00
10  0.84 0.81 0.82 0.74 0.73 0.75 0.80 0.82 0.89 1.00
11  0.78 0.80 0.78 0.72 0.71 0.76 0.79 0.81 0.80 0.84 1.00
12  0.78 0.81 0.79 0.78 0.81 0.84 0.82 0.84 0.78 0.75 0.74 1.00
13  0.83 0.88 0.86 0.82 0.85 0.86 0.89 0.90 0.81 0.76 0.75 0.80 1.00
14  0.80 0.85 0.84 0.84 0.88 0.86 0.87 0.88 0.79 0.75 0.73 0.86 0.89 1.00
15  -0.31 -0.31 -0.31 -0.28 -0.32 -0.29 -0.32 -0.31 -0.30 -0.26 -0.25 -0.25 -0.32 -0.30 1.00
16  0.66 0.65 0.67 0.81 0.70 0.65 0.68 0.69 0.66 0.65 0.63 0.67 0.65 0.69 -0.21 1.00
17  0.74 0.78 0.78 0.77 0.78 0.81 0.79 0.81 0.72 0.69 0.67 0.76 0.79 0.81 -0.23 0.67 1.00
18  0.75 0.79 0.78 0.80 0.87 0.79 0.80 0.84 0.76 0.72 0.70 0.79 0.83 0.85 -0.31 0.68 0.80 1.00
19  0.81 0.77 0.83 0.73 0.74 0.74 0.77 0.79 0.85 0.79 0.72 0.72 0.76 0.75 -0.28 0.64 0.69 0.74 1.00
20  0.81 0.81 0.82 0.76 0.76 0.79 0.82 0.83 0.81 0.79 0.75 0.73 0.80 0.79 -0.29 0.64 0.74 0.74 0.78 1.00
21  0.74 0.74 0.75 0.68 0.69 0.70 0.75 0.75 0.75 0.74 0.69 0.66 0.73 0.71 -0.29 0.59 0.64 0.67 0.72 0.79 1.00
22  0.62 0.64 0.62 0.57 0.59 0.69 0.65 0.64 0.55 0.50 0.47 0.55 0.62 0.61 -0.26 0.44 0.60 0.54 0.52 0.58 0.51 1.00
23  0.84 0.89 0.87 0.82 0.86 0.88 0.89 0.93 0.81 0.76 0.76 0.80 0.87 0.86 -0.35 0.65 0.79 0.82 0.76 0.81 0.74 0.69 1.00
24  -0.26 -0.29 -0.29 -0.23 -0.29 -0.25 -0.30 -0.28 -0.24 -0.22 -0.17 -0.19 -0.36 -0.28 0.42 -0.15 -0.18 -0.26 -0.23 -0.27 -0.29 -0.25 -0.31 1.00
25  -0.28 -0.32 -0.31 -0.30 -0.34 -0.29 -0.33 -0.32 -0.26 -0.21 -0.15 -0.23 -0.41 -0.34 0.45 -0.21 -0.26 -0.32 -0.23 -0.29 -0.29 -0.29 -0.35 0.64 1.00
26  0.39 0.40 0.42 0.40 0.43 0.42 0.43 0.41 0.32 0.28 0.21 0.32 0.45 0.43 -0.24 0.26 0.38 0.36 0.32 0.39 0.37 0.47 0.43 -0.52 -0.45 1.00
27  -0.04 -0.05 -0.05 -0.02 -0.09 0.00 -0.05 -0.05 -0.06 -0.04 -0.04 -0.01 -0.11 -0.05 0.44 0.00 0.07 -0.10 -0.07 -0.04 -0.10 0.03 -0.09 0.65 0.47 -0.12 1.00
28  0.70 0.72 0.70 0.64 0.62 0.74 0.73 0.72 0.66 0.64 0.63 0.64 0.65 0.66 -0.17 0.51 0.65 0.58 0.59 0.66 0.59 0.66 0.70 0.01 -0.09 0.31 0.31 1.00
29  0.84 0.90 0.86 0.81 0.82 0.89 0.90 0.92 0.80 0.76 0.78 0.80 0.84 0.83 -0.27 0.65 0.78 0.77 0.73 0.79 0.71 0.64 0.89 -0.18 -0.24 0.36 0.06 0.80

Page 16: Analysis of the Psychometric Properties of the Student ...iea.fau.edu/inst/spot04.pdf · An Analysis of the Psychometric Properties of the Student Perception of Teaching Instrument

Table 5 Rotated Principal Components Analysis with Three Components Extracted

SPOT Item   Component 1   Component 2   Component 3

Description of course objectives and assignments   .904   .127   .189
Communication of ideas and information   .904   .143   .257
Expression of expectations for performance in this class   .908   .147   .230
Availability to assist students in or out of class   .868   .115   .228
Respect and concern for students   .857   .184   .264
Stimulation of interest in this course   .876   .008   .341
Facilitation of learning   .903   .143   .287
Overall rating of instructor   .925   .139   .257

The instructor:
Clearly stated the objectives of the course   .909   .141   .006
Covered what was stated in the course objectives   .894   .112   -.001
Used class time effectively   .882   .007   -.004
Seemed interested in teaching   .863   .006   .163
Seemed to know when students did not understand the course material   .851   .231   .293
Seemed concerned with whether students learned   .865   .152   .274
Expected too much work for the number of credits   .247   .625   .004
Was available during office hours or appointment times   .759   .006   .006
Encouraged students to think for themselves   .806   .003   .310
Was willing to listen to students’ questions and opinions   .836   .183   .180
Told students how they would be evaluated   .854   .148   .005
Gave assignments that assisted in learning the material   .849   .142   .187
Gave exams that reflected the material covered   .777   .194   .134

As a result of this course:
I would like to learn more about this subject   .538   .005   .622
I would like to take another course with this instructor   .866   .179   .305
How difficult was this course for you?   -.103   -.836   -.288
How was the pace at which the instructor covered the material?   -.142   -.730   -.362
What grade do you expect to receive in this course?   .204   .339   .766
How much effort have you put into this course?   .003   -.883   .211
How much do you think that you have learned in this course?   .723   -.250   .411
What is your rating of this instructor compared to other instructors you have had?   .884   .001   .292

Eigenvalue   17.46   3.00   2.50
Percent of variance explained   60.2   10.3   8.6


Table 6a
Intraclass Correlation Coefficients by Class Size
Lower Division

Item   N = 5   N = 10   N = 20   N = 40
Description of course objectives   0.59   0.81   0.87   0.93
Communication of ideas and info   0.65   0.84   0.89   0.95
Expression of expectations   0.57   0.80   0.87   0.93
Availability to assist students   0.65   0.78   0.84   0.85
Respect and concern for students   0.69   0.81   0.85   0.88
Stimulation of interest in course   0.59   0.82   0.87   0.91
Facilitation of learning   0.58   0.83   0.86   0.93
Overall rating of instructor   0.70   0.85   0.90   0.95
Clearly stated objectives of course   0.62   0.79   0.84   0.93
Covered what was stated   0.29   0.77   0.82   0.91
Used class time effectively   0.49   0.79   0.85   0.94
Seemed interested in teaching   0.50   0.81   0.82   0.90
Seemed to know when students didn't understand   0.58   0.80   0.86   0.91
Seemed concerned with whether students learned   0.56   0.76   0.83   0.92
Expected too much work   0.58   0.58   0.74   0.85
Was available during office hours   0.52   0.70   0.70   0.77
Encouraged students to think   0.28   0.66   0.77   0.82
Willing to listen to questions and concerns   0.53   0.78   0.81   0.80
Told students how they would be evaluated   0.40   0.74   0.81   0.89
Gave assignments that assisted in learning   0.35   0.76   0.84   0.90
Gave exams that reflected material   0.52   0.73   0.84   0.92
I would like to learn more about subject   0.57   0.73   0.81   0.92
I would like to take another course with instructor   0.59   0.81   0.88   0.94
How difficult was this course?   0.60   0.73   0.83   0.96
How was the pace?   0.49   0.68   0.79   0.88
What grade do you expect?   0.52   0.73   0.81   0.91
How much effort have you put?   0.50   0.66   0.76   0.87
How much have you learned?   0.51   0.68   0.76   0.86
Instructor compared to others   0.63   0.83   0.88   0.94


Table 6b
Intraclass Correlation Coefficients by Class Size
Upper Division

Item   N = 5   N = 10   N = 20   N = 40
Description of course objectives   0.73   0.81   0.88   0.90
Communication of ideas and info   0.74   0.83   0.90   0.93
Expression of expectations   0.74   0.80   0.88   0.91
Availability to assist students   0.74   0.78   0.87   0.91
Respect and concern for students   0.72   0.80   0.88   0.93
Stimulation of interest in course   0.73   0.80   0.88   0.92
Facilitation of learning   0.72   0.81   0.89   0.92
Overall rating of instructor   0.78   0.83   0.90   0.94
Clearly stated objectives of course   0.74   0.80   0.87   0.88
Covered what was stated   0.74   0.78   0.86   0.88
Used class time effectively   0.73   0.78   0.86   0.87
Seemed interested in teaching   0.72   0.78   0.85   0.91
Seemed to know when students didn't understand   0.73   0.79   0.88   0.93
Seemed concerned with whether students learned   0.69   0.77   0.87   0.90
Expected too much work   0.52   0.65   0.79   0.74
Was available during office hours   0.71   0.68   0.78   0.82
Encouraged students to think   0.64   0.68   0.81   0.82
Willing to listen to questions and concerns   0.70   0.77   0.85   0.93
Told students how they would be evaluated   0.70   0.73   0.84   0.86
Gave assignments that assisted in learning   0.64   0.76   0.86   0.88
Gave exams that reflected material   0.61   0.76   0.86   0.89
I would like to learn more about subject   0.59   0.71   0.80   0.85
I would like to take another course with instructor   0.74   0.80   0.88   0.92
How difficult was this course?   0.56   0.73   0.88   0.90
How was the pace?   0.63   0.71   0.83   0.89
What grade do you expect?   0.50   0.75   0.86   0.89
How much effort have you put?   0.52   0.68   0.81   0.81
How much have you learned?   0.62   0.73   0.83   0.81
Instructor compared to others   0.75   0.82   0.89   0.93


Table 6c
Intraclass Correlation Coefficients by Class Size
Graduate

Item   N = 5   N = 10   N = 20
Description of course objectives   0.66   0.81   0.90
Communication of ideas and info   0.71   0.83   0.90
Expression of expectations   0.70   0.82   0.91
Availability to assist students   0.66   0.77   0.84
Respect and concern for students   0.64   0.78   0.86
Stimulation of interest in course   0.69   0.81   0.88
Facilitation of learning   0.71   0.82   0.89
Overall rating of instructor   0.73   0.84   0.91
Clearly stated objectives of course   0.63   0.79   0.92
Covered what was stated   0.65   0.82   0.90
Used class time effectively   0.63   0.81   0.89
Seemed interested in teaching   0.61   0.79   0.86
Seemed to know when students didn't understand   0.61   0.75   0.86
Seemed concerned with whether students learned   0.58   0.75   0.84
Expected too much work   0.41   0.70   0.73
Was available during office hours   0.51   0.67   0.74
Encouraged students to think   0.50   0.72   0.78
Willing to listen to questions and concerns   0.62   0.73   0.84
Told students how they would be evaluated   0.49   0.78   0.89
Gave assignments that assisted in learning   0.60   0.77   0.86
Gave exams that reflected material   0.55   0.77   0.84
I would like to learn more about subject   0.47   0.69   0.80
I would like to take another course with instructor   0.64   0.80   0.89
How difficult was this course?   0.59   0.78   0.86
How was the pace?   0.53   0.75   0.86
What grade do you expect?   0.49   0.68   0.80
How much effort have you put?   0.58   0.72   0.82
How much have you learned?   0.58   0.77   0.82
Instructor compared to others   0.72   0.83   0.86


Table 7
Generalizability of Instructor Ratings

Correlations of average class ratings under three conditions: (1) same instructor, same course, two successive terms; (2) same instructor, different course, same term; (3) different instructor, same course, same term.

Item   (1)   (2)   (3)
Description of course objectives   0.58   0.51   0.17
Communication of ideas and info   0.63   0.54   0.19
Expression of expectations   0.59   0.51   0.18
Availability to assist students   0.56   0.52   0.14
Respect and concern for students   0.58   0.54   0.16
Stimulation of interest in course   0.62   0.54   0.20
Facilitation of learning   0.61   0.53   0.18
Overall rating of instructor   0.61   0.56   0.17
Clearly stated objectives of course   0.53   0.46   0.16
Covered what was stated   0.49   0.44   0.13
Used class time effectively   0.51   0.45   0.12
Seemed interested in teaching   0.53   0.47   0.17
Seemed to know when students didn't understand   0.58   0.50   0.21
Seemed concerned with whether students learned   0.56   0.50   0.16
Expected too much work   0.46   0.37   0.24
Was available during office hours   0.43   0.44   0.13
Encouraged students to think   0.52   0.44   0.18
Willing to listen to questions and concerns   0.53   0.47   0.14
Told students how they would be evaluated   0.51   0.41   0.15
Gave assignments that assisted in learning   0.52   0.42   0.15
Gave exams that reflected material   0.50   0.41   0.15
I would like to learn more about subject   0.65   0.40   0.43
I would like to take another course with instructor   0.61   0.49   0.14
How difficult was this course?   0.70   0.43   0.41
How was the pace?   0.56   0.36   0.23
What grade do you expect?   0.79   0.62   0.54
How much effort have you put?   0.62   0.34   0.31
How much have you learned?   0.55   0.42   0.22
Instructor compared to others   0.61   0.56   0.13
Average   0.57   0.47   0.20

Note: Spearman's rho. Fall and spring terms. Class size within 10.
