Highline College Library: Evaluating Faculty: Articles on Student Evaluations

"How to Read a Student Evaluation of Your Teaching"

Perlmutter, David. "How to Read a Student Evaluation of Your Teaching." Chronicle of Higher Education. October 30, 2011.

First two paragraphs: The fall semester is under way, your courses are exciting, and you are busily "professing" about biochemistry, microeconomics, or Middlemarch to students encountering you for the first time. Surely they will know how much you care, how hard you have worked to be here, how much they have to learn from you.

Well, one would hope so. But at the end of the term you will get some hard data on: (a) how well they performed on the measures you created to test their learning and (b) how well you fared in the measures the university created to test your teaching.

David D. Perlmutter is director of the School of Journalism and Mass Communication and a professor at the University of Iowa.

Student Feedback on Teaching: Why Mean Ratings May Not Tell the Full Story

Kennesaw State University Center for Excellence in Teaching and Learning (Dec. 2016)

Since 2010, KSU has been collecting student feedback on teaching using an online system. This system has saved considerably on paper and staff processing time compared to the paper forms used in the past. However, I am aware that faculty have expressed some concerns about the appropriate interpretation and use of student feedback. I would like to address one of these concerns, which is the use and interpretation of the mean (average) ratings of the items on the form.

Interpreting and Using Student Ratings Data: Guidance for Faculty Serving as Administrators and on Evaluation Committees

Linse, Angela R. “Interpreting and Using Student Ratings Data: Guidance for Faculty Serving as Administrators and on Evaluation Committees.” Studies in Educational Evaluation, vol. 54, Elsevier Ltd, Sept. 2017, pp. 94–106.

Abstract: This article is about the accurate interpretation of student ratings data and the appropriate use of that data to evaluate faculty. Its aim is to make recommendations for use and interpretation based on more than 80 years of student ratings research. As more colleges and universities use student ratings data to guide personnel decisions, it is critical that administrators and faculty evaluators have access to research-based information about their use and interpretation.

The article begins with an overview of common views and misconceptions about student ratings, followed by clarification of what student ratings are and are not. Next are two sections that provide advice for two audiences—administrators and faculty evaluators—to help them accurately, responsibly, and appropriately use and interpret student ratings data. A list of administrator questions is followed by a list of advice for faculty responsible for evaluating other faculty members’ records.

The Teacher Behaviors Checklist: Factor Analysis of Its Utility for Evaluating Teaching

Keeley, Jared, et al. “The Teacher Behaviors Checklist: Factor Analysis of Its Utility for Evaluating Teaching.” Teaching of Psychology, vol. 33, no. 2, Spring 2006, pp. 84–91. EBSCOhost.

Abstract: We converted the Teacher Behaviors Checklist (TBC; Buskist, Sikorski, Buckley, & Saville, 2002) to an evaluative instrument to assess teaching by adding specific instructions and a Likert-type scale. Factor analysis of the modified TBC produced 2 subscales: caring and supportive and professional competency and communication skills. Further psychometric analysis suggested the instrument possessed excellent construct validity and reliability, underscoring its potential as a tool for assessing teaching. This instrument clearly identifies specific target teaching behaviors that instructors can alter to attempt to improve their teaching effectiveness.

An Evaluation of Course Evaluations

Science Open (2014)

Abstract - Student ratings of teaching have been used, studied, and debated for almost a century. This article examines student ratings of teaching from a statistical perspective. The common practice of relying on averages of student teaching evaluation scores as the primary measure of teaching effectiveness for promotion and tenure decisions should be abandoned for substantive and statistical reasons: There is strong evidence that student responses to questions of “effectiveness” do not measure teaching effectiveness. Response rates and response variability matter. And comparing averages of categorical responses, even if the categories are represented by numbers, makes little sense. Student ratings of teaching are valuable when they ask the right questions, report response rates and score distributions, and are balanced by a variety of other sources and methods to evaluate teaching.

Student Ratings of Teaching: A Summary of Research and Literature

IDEA Paper #50 (2010)
This IDEA Paper is an update of IDEA Paper No. 32 Student Ratings of Teaching: The Research Revisited (Cashin, 1995). Much of the content of IDEA Paper No. 32 is retained where no subsequently published study has changed its basic conclusions. However, studies or reviews of the literature that provided questions, modifications, or further support for its conclusions were included in this paper. We have attempted to summarize the conclusions of the major reviews of the student ratings research and literature from the 1970s to 2010. That literature is extensive and complex; a paper this brief can offer only broad, general summaries and limited citations.

"Measuring Teaching Quality in Higher Education: Assessing Selection Bias in Course Evaluations"

Goos, Maarten, and Salomons, Anna. “Measuring Teaching Quality in Higher Education: Assessing Selection Bias in Course Evaluations.” Research in Higher Education, vol. 58, no. 4, June 2017, pp. 341–64, doi:10.1007/s11162-016-9429-8.

Abstract: Student evaluations of teaching (SETs) are widely used to measure teaching quality in higher education and compare it across different courses, teachers, departments and institutions. Indeed, SETs are of increasing importance for teacher promotion decisions, student course selection, as well as for auditing practices demonstrating institutional performance. However, survey response is typically low, rendering these uses unwarranted if students who respond to the evaluation are not randomly selected along observed and unobserved dimensions. This paper is the first to fully quantify this problem by analyzing the direction and size of selection bias resulting from both observed and unobserved characteristics for over 3000 courses taught in a large European university. We find that course evaluations are upward biased, and that correcting for selection bias has non-negligible effects on the average evaluation score and on the evaluation-based ranking of courses. Moreover, this bias mostly derives from selection on unobserved characteristics, implying that correcting evaluation scores for observed factors such as student grades does not solve the problem. However, we find that adjusting for selection only has small impacts on the measured effects of observables on SETs, validating a large related literature which considers the observable determinants of evaluation scores without correcting for selection bias.

Gender Bias in Student Evaluations

The Teacher (2018)

Abstract - Many universities use student evaluations of teachers (SETs) as part of consideration for tenure, compensation, and other employment decisions. However, in doing so, they may be engaging in discriminatory practices against female academics. This study further explores the relationship between gender and SETs described by MacNell, Driscoll, and Hunt (2015) by using both content analysis in student-evaluation comments and quantitative analysis of students’ ordinal scoring of their instructors. The authors show that the language students use in evaluations regarding male professors is significantly different than language used in evaluating female professors. They also show that a male instructor administering an identical online course as a female instructor receives higher ordinal scores in teaching evaluations, even when questions are not instructor-specific. Findings suggest that the relationship between gender and teaching evaluations may indicate that the use of evaluations in employment decisions is discriminatory against women.

"In Defense (Sort of) of Student Evaluations of Teaching"

Gannon, Kevin. "In Defense (Sort of) of Student Evaluations of Teaching." Chronicle of Higher Education. May 9, 2018.

First paragraph: A couple of weeks after the end of my first semester of teaching as the instructor of record, I received "the packet" in my campus mailbox — an interoffice envelope stuffed with course evaluations from my students. Those evaluations mattered a lot to me at the time, as I was still figuring out this whole teaching thing. Was I doing a good job? Did my students like the class? And, more selfishly, did they like me?

Kevin Gannon is a professor of history at Grand View University and director of its Center for Excellence in Teaching and Learning.

The Adequacy of Response Rates to Online and Paper Surveys: What Can Be Done?

Assessment & Evaluation in Higher Education (2008)

This article is about differences between, and the adequacy of, response rates to online and paper-based course and teaching evaluation surveys. Its aim is to provide practical guidance on these matters. The first part of the article gives an overview of online surveying in general, a review of data relating to survey response rates and practical advice to help boost response rates. The second part of the article discusses when a response rate may be considered large enough for the survey data to provide adequate evidence for accountability and improvement purposes. The article ends with suggestions for improving the effectiveness of evaluation strategy. These suggestions are: to seek to obtain the highest response rates possible to all surveys; to take account of probable effects of survey design and methods on the feedback obtained when interpreting that feedback; and to enhance this action by making use of data derived from multiple methods of gathering feedback.

"Fine-Tuning Teacher Evaluation"

Marshall, Kim. “Fine-Tuning Teacher Evaluation.” Educational Leadership, vol. 70, no. 3, Nov. 2012, p. 50. EBSCOhost, search.ebscohost.com/login.aspx?direct=true&AuthType=ip&db=ulh&AN=83173918&site=ehost-live&scope=site.

Abstract: As many states and districts rethink teacher supervision and evaluation, the team at the Measures of Effective Teaching (MET) Project, funded by the Bill and Melinda Gates Foundation, has analyzed thousands of lesson videotapes and studied the shortcomings of current practices. The tentative conclusion: Teachers should be evaluated on three factors--classroom observations, student achievement gains, and feedback from students. The use of multiple measures is meant to compensate for the imperfections of each individual measure and produce more accurate and helpful evaluations (Kane & Cantrell, 2012). This approach makes sense, but its effectiveness will depend largely on how classroom observations, achievement data, and student feedback are used. As states and districts rethink their teacher evaluation policies, the author urges them to consider the enhancements to classroom observations, the use of achievement data, and student input that he suggests in this article.

"What Instructor Qualities Do Students Reward?"

Pepe, Julie W, and Wang, Morgan C. “What Instructor Qualities Do Students Reward?” College Student Journal, vol. 46, no. 3, Project Innovation, Inc, Sept. 2012, pp. 603–14.

Abstract: Most higher education institutions have a policy regarding instructor evaluation and students play a dominant role in evaluation of classroom instruction. A standardized course/instructor evaluation form was used to understand the relationship of item responses on the student evaluation form, to the overall instructor score given by students taking general education program (GEP) courses. All student evaluation information from all GEP courses at a large public metropolitan university in the southeast United States for fall 2002 through spring 2009 semesters was used for data analysis. Results suggest that students reward, with higher evaluation scores, instructors who they perceive as organized and strive to clearly communicate content. Additionally, instructors of GEP courses need to be informed that students connect the level of respect and concern shown by the instructor and having an interest in student learning with the overall score they give the instructor. Course characteristics were related to the relative starting point for the scale chosen by students, but instructor qualities were consistent when considering class size, class mode and course area (English, mathematics, communication, etc.). Individual instructor characteristics were not considered in this study.

Why Good Teaching Evaluations May Reward Bad Teaching

Why Good Teaching Evaluations May Reward Bad Teaching: On Grade Inflation and Other Unintended Consequences of Student Evaluations
Association for Psychological Science (2015)
Abstract - In this article, I address the paradox that university grade point averages have increased for decades, whereas the time students invest in their studies has decreased. I argue that one major contributor to this paradox is grading leniency, encouraged by the practice of university administrators to base important personnel decisions on student evaluations of teaching. Grading leniency creates strong incentives for instructors to teach in ways that would result in good student evaluations. Because many instructors believe that the average student prefers courses that are entertaining, require little work, and result in high grades, they feel under pressure to conform to those expectations. Evidence is presented that the positive association between student grades and their evaluation of teaching reflects a bias rather than teaching effectiveness. If good teaching evaluations reflected improved student learning due to effective teaching, they should be positively related to the grades received in subsequent courses that build on knowledge gained in the previous course. Findings that teaching evaluations of concurrent courses, though positively correlated with concurrent grades, are negatively related to student performance in subsequent courses are more consistent with the assumption that concurrent evaluations are the result of lenient grading rather than effective teaching. Policy implications are discussed.

"Students’ Perceptions of the Teaching Evaluation Process"

Kite, Mary E., et al. “Students’ Perceptions of the Teaching Evaluation Process.” Teaching of Psychology, vol. 42, no. 4, Oct. 2015, pp. 307–314. EBSCOhost, doi:10.1177/0098628315603062.

Abstract: We explored how students view the teaching evaluation process and assessed their self-reported behaviors when completing student evaluations of teaching (SETs). We administered a 28-item survey assessing these views to students from a cross section of majors across 20 institutions (N = 597). Responses to this measure were analyzed using exploratory factor analysis. Students also answered an open-ended question about their views; responses were coded into 21 categories. We found that students generally held positive views about the evaluation process and that, overall, these positive views were consistent across type of institution, academic discipline, class standing, and respondent gender. However, compared with small and midsize institutions, community college students were more positive about the usefulness of SETs, and seniors reported greater willingness to provide specific feedback (such as by providing written comments) when completing SETs compared to sophomores and juniors. We conclude by providing suggestions for improving the evaluation process based on our findings.

"Considering Teaching History and Calculating Confidence Intervals in Student Evaluations of Teaching Quality"

Fraile, Rubén, and Francisco Bosch-Morell. “Considering Teaching History and Calculating Confidence Intervals in Student Evaluations of Teaching Quality.” Higher Education, vol. 70, no. 1, July 2015, pp. 55–72. EBSCOhost, doi:10.1007/s10734-014-9823-0.

Abstract: Lecturer promotion and tenure decisions are critical both for university management and for the affected lecturers. Therefore, they should be made cautiously and based on reliable information. Student evaluations of teaching quality are among the most used and analyzed sources of such information. However, to date little attention has been paid in how to process them in order to be able to estimate their reliability. Within this paper we present an approach that provides estimates of such reliability in terms of confidence intervals. This approach, based on Bayesian inference, also provides a means for improving reliability even for lecturers having a low number of student evaluations. Such improvement is achieved by using past information in every year’s evaluations. Results of applying the proposed procedure to university-wide data corresponding to two consecutive years are discussed.

Best Practices in Using Aggregate Course Evaluation
Hanover Research (Aug. 2014)

Main Conclusions:

Traditional course evaluations are largely appropriate for online courses as well. Research has found that instruments used in face‐to‐face classes produce similar results when used in online courses.

Results of course evaluations are commonly reported at the departmental and institutional levels, as well as for individual faculty.

Some institutions are moving to give students access to course evaluation results.

Institutions most commonly use course evaluation results for summative purposes, such as tenure or promotion review.

A faculty handbook should set forth relatively specific guidelines for the use of course evaluations.

"How to Use Student Evaluations Wisely"

Perlmutter, David. "How to Use Student Evaluations Wisely." ChronicleVitae. June 16, 2015.

First paragraph: When I was a doctoral student, nervously facing my first set of student evaluations, I turned for advice to my father, who was already a professor when those evaluations were first introduced. “We should be polling students to see what they thought of our classes,” he insisted. “Of course, their evaluations can’t signify the be-all and end-all for what constitutes effective teaching.” His position sounded sensible to me then -- and still does, now that I am a dean.

David D. Perlmutter is director of the School of Journalism and Mass Communication and a professor at the University of Iowa.