Saying No to ‘Cut Scores’ and Derogatory Labels in Teacher Evaluation
The January 2019 District Trendline from the National Council on Teacher Quality (NCTQ) focused on the language used in teacher evaluation rubrics. In a brief report, NCTQ examined the evaluation rubrics used by 123 large school districts across the United States.
Their findings suggest that teacher evaluation ratings in most school districts rely heavily on labels. In the chart below, NCTQ breaks down the tiers of effectiveness and the common language used to represent each tier:
The findings also delve into summative evaluation procedures and how school districts provide summative evaluations. In the study, 121 out of 123 school districts gave a consolidated label that provided the sole determination of a teacher’s effort for the year.
Only one of the school districts studied used a process similar to the Network for Educator Effectiveness’ summative process. That school district (Burlington, Vermont) had instructional leaders recommend teachers for contract renewal or assistance/supervision but did not assign a single consolidated label to the teacher for the year.
Importance to NEE
This article speaks directly to the NEE Advantage we so often talk about. The Network for Educator Effectiveness disrupts the notion that labels have to be used in teacher evaluation. NEE focuses on making all scoring rubrics as objective as possible, while acknowledging the complex human behaviors inherent to evaluation.
At the end of the post, the author asks a few questions. First, I want to discuss the question: “Will principals be more comfortable rating someone as ‘needs improvement’ than ‘unacceptable’?” From the beginning, NEE has worked to ensure administrators never have to answer this question. By using a numerical system based on the percentage of time and/or percentage of students involved in a teaching practice, NEE attempts to remove these unfair and harsh labels from the teaching profession.
Teaching is a complex profession, and using derogatory labels is antithetical to fostering growth through classroom observations (or any other evaluation rubric). We hear consistently from our administrators that even a score on the lower side of a rubric can stall conversations and make teachers defensive. That is backed up by research, both within the NEE system and in external systems, which indicates administrators are prone to artificially inflating low scores.
Evaluation systems, including NEE, must find a path forward that provides opportunities for conversation and growth for teachers who are not performing at an expected level. Those conversations will never happen when the rubrics themselves use derogatory and inflammatory language.
The last question is also one I wish to talk about. The author asks: “Where did you set the ‘acceptable’ level in your system, and how many rating levels fall below or above it?”
This question provides another look at an often misused part of classroom observations and teacher evaluations. As mentioned earlier, teaching is complex. When evaluating a multitude of effective teaching practices, it is unfair – and possibly even impossible – to expect a teacher to perform effectively on all of them at once. On top of that, teaching practices differ in how hard they are to incorporate or use on a consistent basis.
Too often, evaluation systems try to play “gotcha” with “acceptable levels” or, in other terms, cut scores. At the Network for Educator Effectiveness, we caution against setting acceptable levels, as they take autonomy away from the teacher and may fail to take into account why teaching practices may be absent or not incorporated during an observation. Acceptable levels should be left to conversations between the instructional leader and the teacher and not predetermined without intimate knowledge of the specific classroom and the specific teacher.
The benefit comes in the conversation. Both derogatory language and “cut scores,” the subjects of these two questions, make conversations more difficult. An evaluation system should create a space that allows constructive conversations to happen in a hospitable and open manner. This is what we seek at NEE. We believe that the score itself – any score from any evaluation – is only the beginning of the conversation. It is only data. After collecting data, an instructional leader must be prepared to use it in a way that sets the path toward continued growth and a continued positive culture. The conversation is what matters.
The Network for Educator Effectiveness (NEE) is a simple yet powerful comprehensive system for educator evaluation that helps educators grow, students learn, and schools improve. Developed by preK-12 practitioners and experts at the University of Missouri, NEE brings together classroom observation, student feedback, teacher curriculum planning, and professional development as measures of effectiveness in a secure online portal designed to promote educator growth and development.