Evaluating Faculty: Articles on Student Evaluations

Information and resources related to: >Interpreting student evaluations >Observing classes face to face >Observing online classes

Interpreting and Using Student Ratings Data: Guidance for Faculty Serving as Administrators and on Evaluation Committees

Linse, Angela R. “Interpreting and Using Student Ratings Data: Guidance for Faculty Serving as Administrators and on Evaluation Committees.” Studies in Educational Evaluation, vol. 54, Elsevier Ltd, Sept. 2017, pp. 94–106. (Link)

Abstract: This article is about the accurate interpretation of student ratings data and the appropriate use of that data to evaluate faculty. Its aim is to make recommendations for use and interpretation based on more than 80 years of student ratings research. As more colleges and universities use student ratings data to guide personnel decisions, it is critical that administrators and faculty evaluators have access to research-based information about their use and interpretation.

The article begins with an overview of common views and misconceptions about student ratings, followed by clarification of what student ratings are and are not. Next are two sections that provide advice for two audiences—administrators and faculty evaluators—to help them accurately, responsibly, and appropriately use and interpret student ratings data. A list of administrator questions is followed by a list of advice for faculty responsible for evaluating other faculty members’ records.

"Measuring Teaching Quality in Higher Education: Assessing Selection Bias in Course Evaluations"

Goos, Maarten, and Salomons, Anna. “Measuring Teaching Quality in Higher Education: Assessing Selection Bias in Course Evaluations.” Research in Higher Education, vol. 58, no. 4, June 2017, pp. 341–64, doi:10.1007/s11162-016-9429-8. 
(Link)

Abstract:  Student evaluations of teaching (SETs) are widely used to measure teaching quality in higher education and compare it across different courses, teachers, departments and institutions. Indeed, SETs are of increasing importance for teacher promotion decisions, student course selection, as well as for auditing practices demonstrating institutional performance. However, survey response is typically low, rendering these uses unwarranted if students who respond to the evaluation are not randomly selected along observed and unobserved dimensions. This paper is the first to fully quantify this problem by analyzing the direction and size of selection bias resulting from both observed and unobserved characteristics for over 3000 courses taught in a large European university. We find that course evaluations are upward biased, and that correcting for selection bias has non-negligible effects on the average evaluation score and on the evaluation-based ranking of courses. Moreover, this bias mostly derives from selection on unobserved characteristics, implying that correcting evaluation scores for observed factors such as student grades does not solve the problem. However, we find that adjusting for selection only has small impacts on the measured effects of observables on SETs, validating a large related literature which considers the observable determinants of evaluation scores without correcting for selection bias.  
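The selection effect described in this abstract can be illustrated with a small arithmetic sketch. This is not the authors' model (they estimate selection on both observed and unobserved characteristics); it only shows, under made-up numbers, how the average among respondents drifts upward when more satisfied students are more likely to respond. The function name, the ratings, and the response probabilities are all hypothetical.

```python
def expected_respondent_mean(ratings, response_probs):
    """Large-sample mean rating among students who respond.

    Each student's rating is weighted by that student's (hypothetical)
    probability of filling in the evaluation.
    """
    weighted_sum = sum(r * p for r, p in zip(ratings, response_probs))
    total_weight = sum(response_probs)
    return weighted_sum / total_weight


# Hypothetical class: one student at each point of a 1-5 scale.
true_ratings = [1, 2, 3, 4, 5]
# Assumption for illustration: happier students respond more often.
response_probs = [0.2, 0.3, 0.4, 0.5, 0.6]

true_mean = sum(true_ratings) / len(true_ratings)
observed_mean = expected_respondent_mean(true_ratings, response_probs)
bias = observed_mean - true_mean  # positive value means upward bias
```

With these invented numbers the respondent average sits about half a point above the true class average, even though no individual rating changed.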

Gender Bias in Student Evaluations

"Considering Teaching History and Calculating Confidence Intervals in Student Evaluations of Teaching Quality"

Fraile, Rubén, and Francisco Bosch-Morell. “Considering Teaching History and Calculating Confidence Intervals in Student Evaluations of Teaching Quality.” Higher Education (00181560), vol. 70, no. 1, July 2015, pp. 55–72. EBSCOhost, doi:10.1007/s10734-014-9823-0.
(Link)

Abstract: Lecturer promotion and tenure decisions are critical both for university management and for the affected lecturers. Therefore, they should be made cautiously and based on reliable information. Student evaluations of teaching quality are among the most used and analyzed sources of such information. However, to date little attention has been paid in how to process them in order to be able to estimate their reliability. Within this paper we present an approach that provides estimates of such reliability in terms of confidence intervals. This approach, based on Bayesian inference, also provides a means for improving reliability even for lecturers having a low number of student evaluations. Such improvement is achieved by using past information in every year’s evaluations. Results of applying the proposed procedure to university-wide data corresponding to two consecutive years are discussed.
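As a rough illustration of the idea in this abstract, the sketch below combines a prior based on a lecturer's earlier results with the current year's ratings to produce an interval. It is a textbook normal-normal conjugate update, not the procedure from the paper, and it assumes the spread of individual ratings (rating_sd) is known. The function name, parameters, and sample numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def posterior_interval(ratings, prior_mean, prior_sd, rating_sd=1.0, z=1.96):
    """Return (posterior mean, lower bound, upper bound) for a lecturer's mean rating.

    prior_mean and prior_sd summarize past evaluations (assumed inputs);
    rating_sd is the assumed known standard deviation of a single rating;
    z=1.96 gives an approximate 95% interval under the normal model.
    """
    n = len(ratings)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / rating_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    # With no new ratings, the posterior is simply the prior.
    sample_mean = mean(ratings) if n > 0 else prior_mean
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    half_width = z * sqrt(post_var)
    return post_mean, post_mean - half_width, post_mean + half_width


# Hypothetical example: four ratings this year, prior taken from last year.
center, low, high = posterior_interval([4, 5, 3, 4], prior_mean=4.0, prior_sd=0.5)
```

Because the prior contributes its own precision, the interval is narrower than one computed from the four current ratings alone, which is the kind of gain for small classes that the abstract describes.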

"Students’ Perceptions of the Teaching Evaluation Process"

Kite, Mary E., et al. “Students’ Perceptions of the Teaching Evaluation Process.” Teaching of Psychology, vol. 42, no. 4, Oct. 2015, pp. 307–314. EBSCOhost, doi:10.1177/0098628315603062.
 (Link)

Abstract: We explored how students view the teaching evaluation process and assessed their self-reported behaviors when completing student evaluations of teaching (SETs). We administered a 28-item survey assessing these views to students from a cross section of majors across 20 institutions (N = 597). Responses to this measure were analyzed using exploratory factor analysis. Students also answered an open-ended question about their views; responses were coded into 21 categories. We found that students generally held positive views about the evaluation process and that, overall, these positive views were consistent across type of institution, academic discipline, class standing, and respondent gender. However, compared with small and midsize institutions, community college students were more positive about the usefulness of SETs, and seniors reported greater willingness to provide specific feedback (such as by providing written comments) when completing SETs compared to sophomores and juniors. We conclude by providing suggestions for improving the evaluation process based on our findings.

The Teacher Behaviors Checklist: Factor Analysis of Its Utility for Evaluating Teaching

Keeley, Jared, et al. “The Teacher Behaviors Checklist: Factor Analysis of Its Utility for Evaluating Teaching.” Teaching of Psychology, vol. 33, no. 2, Spring 2006, pp. 84–91. EBSCOhost. (Link)

Abstract: We converted the Teacher Behaviors Checklist (TBC; Buskist, Sikorski, Buckley, & Saville, 2002) to an evaluative instrument to assess teaching by adding specific instructions and a Likert-type scale. Factor analysis of the modified TBC produced 2 subscales: caring and supportive and professional competency and communication skills. Further psychometric analysis suggested the instrument possessed excellent construct validity and reliability, underscoring its potential as a tool for assessing teaching. This instrument clearly identifies specific target teaching behaviors that instructors can alter to attempt to improve their teaching effectiveness.

Student Feedback on Teaching: Why Mean Ratings May Not Tell the Full Story

Student Ratings of Teaching: A Summary of Research and Literature

Student Evaluation of Teaching: A Study Exploring Student Rating Instrument Free-form Text Comment

"How to Read a Student Evaluation of Your Teaching"

Perlmutter, David. "How to Read a Student Evaluation of Your Teaching". Chronicle of Higher Education. October 30, 2011.
(Link)

First two paragraphs: The fall semester is under way, your courses are exciting, and you are busily "professing" about biochemistry, microeconomics, or Middlemarch to students encountering you for the first time. Surely they will know how much you care, how hard you have worked to be here, how much they have to learn from you.

Well, one would hope so. But at the end of the term you will get some hard data on: (a) how well they performed on the measures you created to test their learning and (b) how well you fared in the measures the university created to test your teaching.

David D. Perlmutter is director of the School of Journalism and Mass Communication and a professor at the University of Iowa.

"How to Use Student Evaluations Wisely"

Perlmutter, David. "How to Use Student Evaluations Wisely". ChronicleVitae. June 16, 2015.
(Link)

First paragraph: When I was a doctoral student, nervously facing my first set of student evaluations, I turned for advice to my father, who was already a professor when those evaluations were first introduced. “We should be polling students to see what they thought of our classes,” he insisted. “Of course, their evaluations can’t signify the be-all and end-all for what constitutes effective teaching.” His position sounded sensible to me then -- and still does, now that I am a dean.

David D. Perlmutter is director of the School of Journalism and Mass Communication and a professor at the University of Iowa.

"In Defense (Sort of) of Student Evaluations of Teaching"

Gannon, Kevin. "In Defense (Sort of) of Student Evaluations of Teaching". Chronicle of Higher Education. May 9, 2018.
(Link)

First paragraph:  A couple of weeks after the end of my first semester of teaching as the instructor of record, I received "the packet" in my campus mailbox — an interoffice envelope stuffed with course evaluations from my students. Those evaluations mattered a lot to me at the time, as I was still figuring out this whole teaching thing. Was I doing a good job? Did my students like the class? And, more selfishly, did they like me?

Kevin Gannon is a professor of history at Grand View University and director of its Center for Excellence in Teaching and Learning.

An Evaluation of Course Evaluations

"What Instructor Qualities Do Students Reward?"

Pepe, Julie W, and Wang, Morgan C. “What Instructor Qualities Do Students Reward?” College Student Journal, vol. 46, no. 3, Project Innovation, Inc, Sept. 2012, pp. 603–14.
(Link)

Abstract: Most higher education institutions have a policy regarding instructor evaluation and students play a dominant role in evaluation of classroom instruction. A standardized course/instructor evaluation form was used to understand the relationship of item responses on the student evaluation form, to the overall instructor score given by students taking general education program (GEP) courses. All student evaluation information from all GEP courses at a large public metropolitan university in the southeast United States for fall 2002 through spring 2009 semesters was used for data analysis. Results suggest that students reward, with higher evaluation scores, instructors who they perceive as organized and strive to clearly communicate content. Additionally, instructors of GEP courses need to be informed that students connect the level of respect and concern shown by the instructor and having an interest in student learning with the overall score they give the instructor. Course characteristics were related to the relative starting point for the scale chosen by students, but instructor qualities were consistent when considering class size, class mode and course area (English, mathematics, communication, etc.). Individual instructor characteristics were not considered in this study.

Why Good Teaching Evaluations May Reward Bad Teaching

"Fine-Tuning Teacher Evaluation"

Marshall, Kim. “Fine-Tuning Teacher Evaluation.” Educational Leadership, vol. 70, no. 3, Nov. 2012, p. 50. EBSCOhost, search.ebscohost.com/login.aspx?direct=true&AuthType=ip&db=ulh&AN=83173918&site=ehost-live&scope=site.
(Link)

Abstract: As many states and districts rethink teacher supervision and evaluation, the team at the Measures of Effective Teaching (MET) Project, funded by the Bill and Melinda Gates Foundation, has analyzed thousands of lesson videotapes and studied the shortcomings of current practices. The tentative conclusion: Teachers should be evaluated on three factors--classroom observations, student achievement gains, and feedback from students. The use of multiple measures is meant to compensate for the imperfections of each individual measure and produce more accurate and helpful evaluations (Kane & Cantrell, 2012). This approach makes sense, but its effectiveness will depend largely on how classroom observations, achievement data, and student feedback are used. As states and districts rethink their teacher evaluation policies, the author urges them to consider the enhancements to classroom observations, the use of achievement data, and student input that he suggests in this article.

Best Practices in Using Aggregate Course Evaluation

The Adequacy of Response Rates to Online and Paper Surveys: What Can Be Done?