Teacher Evaluation Overview
Cleaver, S., Detrich, R. & States, J. (2018). Overview of Teacher Evaluation. Oakland, CA: The Wing Institute. https://www.winginstitute.org/quality-teachers-evaluation.
As students progress through school, many elements (home experiences, classroom instruction, and internal factors) influence their eventual outcomes. In the school environment, a teacher’s skills, strengths, and abilities have as much of an influence on student learning as student background (Wenglinsky, 2002). Put another way, teachers matter; teachers who are effective contribute to positive student outcomes and achievement (Johnson & Zwick, 1990; Nye, Konstantopoulos, & Hedges, 2004; Sanders, Wright, & Horn, 1997), so it is important to understand what effective teachers do that influences student outcomes. Equally important is to provide teachers with information and feedback they can use to become better practitioners. That’s where teacher evaluation comes in.
Teacher Evaluation
Teacher evaluation is conducted to ensure teacher quality and to promote professional learning with the goal of improving future performance (Danielson, 2010). A basic definition of teacher evaluation is the formal process used to review teacher performance and effectiveness in the classroom (Sawchuk, 2015). However, this definition is an oversimplification. In practice, teacher evaluation involves understanding and agreeing on the inputs (e.g., the practices that define quality teaching), outputs (e.g., student achievement measures), and methods of evaluation (e.g., student assessment data, teacher observation rubrics). The elements of evaluation are rarely agreed on (Goe, Bell, & Little, 2008). This overview provides information about teacher evaluation as it relates to collecting information about teacher practice and using it to improve student outcomes.
Teacher Evaluation for Improvement and Accountability
Teacher evaluation serves two purposes: improvement and accountability. Evaluation provides teachers with information that can improve their practice and serve as a starting point for professional development; for example, information from teacher evaluations can be used to set a plan of study for professional learning community (PLC) meetings. Evaluation provides accountability when information gained from the evaluation is used to guide decisions regarding bonuses, firing, and other human resource decisions (Santiago & Benavides, 2009).
There is an inherent tension between these two purposes. On one hand, when teachers feel they are focused on improvement, accountability can feel incongruent and teachers may not want to provide accurate information because of the risk of revealing weaknesses. On the other hand, when the focus is on accountability, teachers may feel insecure about their work (Santiago & Benavides, 2009). Goals around improvement may hinder the ability to use evaluation for accountability decisions, while goals around accountability may prevent or obfuscate improvement efforts. If the teacher evaluation process becomes too cumbersome or aversive for either the teacher or evaluator, the process will be in jeopardy.
Summative and Formative Evaluation
Teacher evaluation can serve a summative or formative purpose. Summative evaluation provides conclusive evaluation of a teacher’s performance to determine how well that individual has done his or her work (Marzano, 2012). In this type of evaluation, a supervisor evaluates a teacher using a combination of measures that may include student test scores, lesson plans and artifacts, and rating scales or rubrics. Teachers are not involved and the results are used for accountability decisions such as pay awards or dismissal (Marzano, 2012).
Formative evaluation provides ongoing information about teacher practice with the goal of providing feedback that helps teachers improve. Teachers are often involved in the process through self-reflection or self-assessment. The results of the evaluation may be used to give teachers feedback, and to make decisions regarding the professional development or coaching support that teachers receive (Sayavedra, 2014).
History and Current State of Teacher Evaluation
In the early 20th century, the framework of scientific management, or the idea that every task can be broken down into its best and most efficient method, was applied to education (Marzano, Frontier, & Livingston, 2011). This started a focus on examining teacher behavior, providing suggestions for feedback, and evaluating effectiveness in the classroom (Marzano et al., 2011). Since World War II, the role of evaluation has evolved. Clinical supervision, popular in the 1960s and 1970s, was the first major trend. It involved a pre-observation conference, teacher observation, reflection, and analysis with a focus on classroom behaviors that directly impacted learning. In the 1980s, the Hunter lesson design, also called mastery teaching, was incorporated into observation and evaluation so that administrators observed a specific lesson sequence: anticipatory set, objective and purpose, input, model, checking for understanding, guided practice, and independent practice (Hunter, 1984).
In the mid-1980s, alternatives to clinical supervision and mastery teaching were proposed. In these alternatives, the teacher became a core element in evaluation and principals were expected to differentiate observation and evaluation depending on teachers’ needs and experience (Marzano et al., 2011). Throughout the 1980s and 1990s, there was a shift away from structured observation, along with a move toward formal teacher evaluation (Marzano et al., 2011).
One of these shifts was prompted by a RAND group study of 32 districts across the United States (Wise, Darling-Hammond, McLaughlin, & Bernstein, 1984). The RAND study concluded that there were four primary concerns regarding then-current evaluation: (a) principals were not committed or able to provide accurate evaluations, (b) teachers were not open to receiving feedback, (c) evaluation practices were not uniform, and (d) evaluators were not trained (Wise et al., 1984). The RAND study also outlined the following recommendations for evaluation:
- Evaluation systems should align with goals without being overly prescriptive.
- Principals need time, training, and oversight to implement evaluations effectively.
- An evaluation system should align with the overarching purpose (and a district may need multiple evaluations to align with multiple goals).
- Resources need to be provided and allocated effectively.
- Teachers need to be involved in the design, monitoring, and implementation of evaluation systems.
Throughout the 20th century, teacher evaluation was a district-level initiative, more focused on teacher behavior and administrative supervision. In the 21st century, teacher evaluation has become a focus of national policy, and the emphasis has shifted to evaluation of teacher quality and student achievement (Marzano et al., 2011).
In the late 2000s, two reports critiqued the teacher evaluation system and set the stage for the current conversation. First, Toch and Rothman’s report Rush to Judgment critiqued teacher evaluation as “superficial and capricious” (2008, p. 1) and concluded that it did not measure student learning. Despite No Child Left Behind requirements, Toch and Rothman found only 14 states that required annual teacher evaluations. Similarly, Weisberg, Sexton, Mulhern, and Keeling (2009), in The Widget Effect, found that fewer than 1% of 15,000 teachers in 12 districts and four states were rated “unsatisfactory” and that little action was taken based on results from teacher evaluations. The authors argued that districts were treating teachers as widgets, or interchangeable parts in a system, not as individual professionals with the potential to have an important impact on instructional effectiveness and student outcomes.
This increased concern about how teacher evaluations were being conducted and used, along with legislation around teacher quality, focused state legislature attention on teacher evaluation (Goe, Holdheide, & Miller, 2011). The current conversation still focuses on how teacher evaluations are conducted; the impact of teacher evaluation on teacher effectiveness and student outcomes; and how results are used, for example, in professional development (Sawchuk, 2015).
Relevant Issues in Teacher Evaluation
Current issues in teacher evaluation revolve around core questions on how to design and implement an evaluation, including what framework to use, what to measure, and how to collect data.
Framework
A framework outlines the guiding principles for a teacher evaluation. It lends credibility to the system and gives assurance that evaluators can confidently ascertain the quality of teachers (Danielson, 2010). That framework should include:
- A clear definition of good teaching that is agreed on by everyone involved (Danielson, 2010).
- An understanding of the purpose of the evaluation, which may be information gathering, accountability, or improvement, or any combination of the three (Goe et al., 2008).
- A clear purpose that provides information about whether the evaluation is formative or summative, and how the results will be used (Goe et al., 2008).
- An understanding of who is involved and how, the tools that will be used, and the stakeholders involved (Santiago & Benavides, 2009).
Measurement
Teacher quality is measured both quantitatively (e.g., student test scores) and qualitatively (e.g., notes on teacher professionalism). An analysis of 120 studies (Goe et al., 2008) identified qualitative elements of effective teachers:
- Positive contribution to academic, attitudinal, and social outcomes for students
- Comprehensive lesson planning, progress monitoring, and capacity for adapting and evaluating instruction
- Diversity and civic-mindedness
- Collaboration with stakeholders (e.g., parents, administrators), particularly for students who are at risk (e.g., those with individualized education programs, or IEPs)
Once the elements that will be measured are clear, how to measure each aspect must be considered. While summative evaluations should include a comprehensive variety of measures that can provide a full picture of a teacher’s effectiveness, formative evaluations may include any range of measures used to collect enough information to serve the purpose of the evaluation. The measures used in formative evaluation may also be more teacher focused, including self-assessment, observation, peer mentoring, and coaching. When coaching and peer mentoring are used, it is important to consider training evaluators in how to deliver feedback that leads to improved teacher performance.
Another consideration for measurement is the reliability and validity of tools. Reliability is how well a tool produces consistent and stable results; validity is how well it measures what it is intended to measure and nothing else. Tools that are used to measure teacher practices must be both reliable and valid: they must provide information that is consistent across multiple evaluators and must measure teacher practice without measuring any other factors at the same time. Also, tools used to gauge student outcomes must be valid, meaning that the scores must accurately measure the outcome without measuring anything else (Goe et al., 2008).
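One way to make "consistent across multiple evaluators" concrete is an inter-rater agreement statistic. The sketch below computes Cohen's kappa for two evaluators who score the same lessons on a four-level rubric. It is an illustration only: the function name, the rating data, and the choice of kappa (rather than another agreement statistic) are assumptions, not something the cited studies prescribe.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement.

    Returns 1.0 for perfect agreement and about 0.0 when the raters
    agree no more often than chance would predict.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Share of lessons on which the two raters gave the same level.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's level frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[level] * counts_b[level] for level in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical level throughout; treating
        # this as perfect agreement is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical rubric levels (1 = unsatisfactory ... 4 = distinguished)
# given by two evaluators to the same eight lessons.
principal = [1, 2, 2, 3, 3, 3, 4, 4]
peer = [1, 2, 3, 3, 3, 2, 4, 4]
kappa = cohens_kappa(principal, peer)
```

A low kappa would suggest that scores depend on who observes the lesson, which is the kind of inconsistency that rubric training for evaluators is meant to reduce.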
Blanton et al. (2003) outlined additional criteria that inform the usefulness of a measurement tool:
- The ability to capture all aspects of a teacher’s effectiveness
- The ability to capture the range of activities in a teacher’s work
- Usefulness of the scores to be used for a specific purpose
- Feasibility, including the cost, training required, and other considerations
- Credibility or the trust that the stakeholders have in the measure
Charlotte Danielson Framework for Teaching.
A common measure used for teacher evaluation is the Charlotte Danielson Framework for Teaching (Danielson, 1996, 2007), which includes an extensive rubric over four domains: planning and preparation, classroom environment, instruction, and professional responsibilities. Across these four domains, the rubric incorporates 76 elements of teaching broken into four levels of performance (unsatisfactory, basic, proficient, and distinguished). Over time and two iterations (1996 and 2007), the Danielson framework has become the primary tool for capturing teaching and learning (Marzano et al., 2011). The Danielson Framework for Teaching (Danielson, 1996) was intended to do three things:
- Acknowledge the difficulty and complexity of teaching as a profession.
- Create a language for professional engagement.
- Provide a structure for teacher assessment and reflection.
Research conducted on the Danielson framework indicates acceptable reliability and validity (Lash, Tran, & Huang, 2016). When there is score variance, it is attributable to the teacher, not other variables (Kane & Staiger, 2012; Kane, Taylor, Tyler, & Wooten, 2011). This means that when a score differs from one evaluation to the next, such as when a teacher advances in the area of planning and preparation from fall to winter, the difference between the two scores occurs because the teacher changed his or her practice, not because the tool was unclear. The reliability of achievement growth scores varies (Kane & Staiger, 2012; Lash et al., 2016). One study that used evaluations from 156 teachers across 18 high-poverty charter schools in the mid-Atlantic concluded that using multiple measures across a school year (in this case, three separate observations using the Danielson framework) provided a reliable measure (Kettler & Reddy, 2017).
Value-Added Measures
Value-added measures are a way to take into account the various conditions and factors that contribute to student achievement, across multiple years of teaching, and in comparison with other teachers. This way of calculating a teacher’s effectiveness was developed in the 2000s using statistical models that could determine how much one teacher contributed to student learning (Goe et al., 2008).
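The basic idea can be sketched in a few lines. The example below is a deliberately simplified illustration, not the model used in any of the cited studies: it predicts each student's current score from the prior-year score alone, using one regression line fitted across all students, and treats a teacher's average residual as that teacher's value added. Operational models typically add more controls and multiple years of data. The function name and the scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Estimate a simple value-added score per teacher.

    records: iterable of (teacher, prior_score, current_score) tuples.
    Returns {teacher: mean residual}, where a residual is the student's
    actual current score minus the score predicted from the prior score.
    """
    records = list(records)
    priors = [prior for _, prior, _ in records]
    currents = [current for _, _, current in records]
    prior_mean, current_mean = mean(priors), mean(currents)
    spread = sum((p - prior_mean) ** 2 for p in priors)
    if spread == 0:
        raise ValueError("prior scores must vary to fit a line")
    # Ordinary least squares with a single predictor (the prior score).
    slope = sum(
        (p - prior_mean) * (c - current_mean) for p, c in zip(priors, currents)
    ) / spread
    intercept = current_mean - slope * prior_mean
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (intercept + slope * prior))
    return {teacher: mean(values) for teacher, values in residuals.items()}


# Hypothetical scores: (teacher, last year's score, this year's score).
scores = [
    ("A", 50, 60), ("A", 70, 80),
    ("B", 50, 50), ("B", 70, 70),
]
estimates = value_added(scores)
```

With these made-up numbers, students of teacher A score 5 points above what their prior scores predict and students of teacher B score 5 points below, so the estimates are relative to the other teachers in the data rather than absolute.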
Because they are removed from the immediate classroom experience and seem disconnected from what happens in classrooms, value-added measures are controversial (Goe et al., 2008). However, these measures do have reliability. A study by the Bill and Melinda Gates Foundation (2010) found that teachers whose students showed gains in one assessment were likely to show gains in related assessments that measured conceptual understanding. For example, a math teacher whose students scored high on the state math assessment was likely to have students who also demonstrated a deep knowledge of the core principles of math. The correlation between teacher value-added measures on state tests and deeper understanding was higher for math (0.54) than for reading (0.37). Still, it is important to consider that teachers who produce strong value-added scores on state tests may also develop students’ overarching skills and depth of knowledge about the subject.
As a summative measure, value-added measures provide an overarching look at a teacher’s impact over time. Yet, as a formative tool, value-added measures do not provide information about what high-performing teachers do that makes a difference in student learning (Goe et al., 2008). While value-added models are useful for identifying trends that can be used to make system improvements, multiple reports have recommended against using them for individual personnel decisions (American Statistical Association, 2014; Darling-Hammond et al., 2012; Polikoff & Porter, 2014). Specifically, the American Statistical Association cautioned against using value-added measures because, among other reasons, they are based on only one measure (standardized test scores), and the models may not capture all the factors that contribute to the effect a teacher may have on student outcomes.
Continuum of Research and Impact on Student Outcomes
Teacher evaluation is an established practice directed by state and federal law. However, we do not know the exact or full impact of teacher evaluation practices on student outcomes (e.g., Stecher et al., 2018). Some research has attempted to connect the practice of teacher evaluation with changes in student outcomes. In three notable large-scale studies, teacher evaluation was the practice of assessing teachers using a valid and reliable tool and providing feedback. These studies produced mixed results on student or school-level outcomes.
A quasi-experimental study of mid-career elementary and middle school teachers in the Cincinnati Public Schools Teacher Evaluation System (TES) examined teachers before, during, and after a year-long evaluation. The 105 teachers involved in the study taught fourth- through eighth-grade math. Evaluations, which relied on multiple structured classroom observations by trained peers and administrators, were conducted between the 2003–2004 and 2009–2010 school years. The observations used a rubric based on the Danielson Framework for Teaching (Danielson, 1996, 2007). Student achievement was compared before, during, and after the teacher’s evaluation year. Teachers were more effective in advancing student achievement in math the year they were evaluated and the years afterward. Specifically, a student who was taught by a teacher who had been through TES scored 11% of a standard deviation (4.5 percentile points for a median student) higher in math compared with a student taught by the same teacher before the evaluation. The study did not identify what about teacher practice accounted for the difference in student achievement. This study supports the use of teacher evaluation to encourage continued growth in mid-career teachers’ performance and a connection to student achievement. Also, performance improvement was greatest for teachers who were weakest at the start of the evaluation (those who received low initial scores or who were ineffective in improving student test scores the year prior to evaluation). Teacher evaluation was a way for teachers who needed the most support, those who scored the lowest on initial evaluations and likely received the most critical feedback, to receive development (Taylor & Tyler, 2012a, 2012b).
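The link between an effect expressed in standard deviations and one expressed in percentile points can be checked with a short calculation. The sketch below assumes test scores are normally distributed, which is an assumption of this illustration rather than a detail given in the study; the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_gain(effect_size_sd, start_percentile=50.0):
    """Percentile points gained by a student who starts at start_percentile
    and improves by effect_size_sd standard deviations, assuming scores
    follow a normal distribution.

    start_percentile must lie strictly between 0 and 100.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_start = standard_normal.inv_cdf(start_percentile / 100)
    end_percentile = 100 * standard_normal.cdf(z_start + effect_size_sd)
    return end_percentile - start_percentile


# Effect reported for the Cincinnati study: 11% of a standard deviation.
gain_for_median_student = percentile_gain(0.11)
```

Under this assumption, an effect of 0.11 standard deviations moves a median student up by roughly 4.4 percentile points, in line with the 4.5 points reported; the small gap may reflect rounding of the effect size. The same effect translates into fewer percentile points for students who start far from the median.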
In another large-scale study, the Chicago Public Schools’ Excellence in Teaching Project was a teacher evaluation program focused on increasing student learning through principal-teacher conversation. A pilot study included 44 elementary schools in 2008–2009 and an additional 48 schools in 2009–2010. Principals in the first cohort received a total of 50 hours of support across the school year, with training and development in the Danielson framework, best practices in teacher observation and evidence collection, coaching, and implementation. Principals who joined the project in the second year received significantly less support. This difference in support across the two cohorts may have impacted the results. Short-term positive effects on reading performance were found in high-achieving, low-poverty schools, and schools that were in the first cohort performed higher in reading and math than schools in the second cohort. This study suggests that teacher evaluation systems produce different effects at different schools, and that teacher observation can have an impact on school performance (Steinberg & Sartain, 2015).
The Gates Foundation has been extensively involved in teacher evaluation as it relates to student achievement outcomes (Barnum, 2018). In 2018, the Gates Foundation released a cumulative study that reflected its work in three districts (Stecher et al., 2018). The Intensive Partnerships for Effective Teaching initiative was focused on increasing student performance by improving teaching effectiveness. The project started in 2009–2010 in three school districts (Hillsborough County Public Schools in Florida, Memphis City Schools, and Pittsburgh Public Schools) and four charter management organizations. Across multiple years, teaching effectiveness measures collected using a rubric were used to improve staffing, identify areas of development, strengthen professional development, and structure teacher advancement and compensation. The researchers hypothesized that with a strong teaching effectiveness evaluation system in place, teaching quality would increase and lead to greater academic outcomes for students in low-income, minority schools. The final report (Stecher et al., 2018) noted that school sites had implemented the teacher effectiveness practices (evaluation using an observation rubric and subsequent decision-making), but the advancement in student achievement or graduation rates was not realized, particularly for low-income minority students. At the end of the project (2014–2015), student achievement, access to effective teaching, and graduation rates in sites that had participated in the initiative did not differ from those in sites that had not participated. The reason why there was no difference was unclear, although the researchers hypothesized that a focus exclusively on teacher effectiveness may not be enough to improve student outcomes and that other factors may need to be addressed to produce dramatic improvements in student outcomes.
Implications
Teacher evaluation is a best practice that can be used to inform decisions when implemented with transparent processes and strong measures. The process of teacher evaluation produces some change in teacher practice that can impact student outcomes during and after the evaluation period (Taylor & Tyler, 2012a, 2012b). However, teacher evaluation may have different impacts on schools with varying demographics and baseline achievement levels (Steinberg & Sartain, 2015). Finally, formative evaluation can provide clear, objective feedback and a structure for collecting and using data to show teachers how they are changing performance, and, in that way, serve as professional development to support low-performing teachers (Taylor & Tyler, 2012a, 2012b).
Cost-Benefit of Teacher Evaluation.
The cost-benefit of teacher evaluation encompasses many considerations including student learning outcomes, information gathered, and the ability to make decisions with the information (Peterson, 2000). It is likely that the benefits and costs will be specific to a school or district.
For example, one study of the cost to start a teacher evaluation system across three districts found that it ranged from $8 to $115 per student, which equated to between 0.4% and 0.5% of total district spending, and between 1% and 1.3% of teacher compensation (Chambers, Brodziak de los Reyes, & O’Neil, 2013). The researchers concluded that their figures did not reflect all potential costs and that the cost of actual implementation might be higher.
Conclusion
Currently, teacher evaluation is understood as a form of professional development. The goal is to establish a rigorous and fair system that can be used to make decisions related to hiring, firing, and promotion, and that can improve teacher practice and student learning (Bill and Melinda Gates Foundation, 2012). This is no easy task as evidenced by the mixed results for large-scale studies that have examined the impact of teacher evaluation on student achievement (Stecher et al., 2018; Steinberg & Sartain, 2015; Taylor & Tyler, 2012a, 2012b).
As a practice, teacher evaluation is an established way to gather information about how teachers are performing in the classroom and is already incorporated into the expectations and day-to-day work of school administrators. With current measures (e.g., the Danielson Framework for Teaching), it is possible to collect reliable and valid data related to teacher performance and use that data to design professional development targeted at teacher needs. With rigorous measures and quality implementation, teacher evaluation, especially formative evaluation, is a tool that, ideally, can be used to improve teacher quality over time.
Citations
American Statistical Association. (2014, April 8). ASA statement on using value-added models for educational assessment. Retrieved from https://www.scribd.com/document/217916454/ASA-VAM-Statement-1
Barnum, M. (2018, June 21). The Gates Foundation bet big on teacher evaluation. The report it commissioned explains how those efforts fell short. Chalkbeat. Retrieved from https://www.chalkbeat.org/posts/us/2018/06/21/the-gates-foundation-bet-big-on-teacher-evaluation-the-report-it-commissioned-explains-how-those-efforts-fell-short/
Bill and Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Retrieved from https://docs.gatesfoundation.org/documents/preliminary-findings-research-paper.pdf
Bill and Melinda Gates Foundation. (2012). Gathering feedback on teaching: Combining high-quality observation with student surveys and achievement gains. Retrieved from http://k12education.gatesfoundation.org/resource/gathering-feedback-on-teaching-combining-high-quality-observations-with-student-surveys-and-achievement-gains-2/
Blanton, L. P., Sindelar, P. T., Correa, V., Harman, M., McDonnell, J., & Kuhel, K. (2003). Conceptions of beginning teacher quality: Models for conducting research (COPSSE Doc. No. RS-6). Gainesville, FL: Center on Personnel Studies in Special Education (COPSSE), University of Florida. Retrieved from http://copsse.education.ufl.edu//docs/RS-6/1/RS-6.pdf
Chambers, J., Brodziak de los Reyes, I., & O’Neil, C. (2013). How much are districts spending to implement teacher evaluation systems? Case studies of Hillsborough County Public Schools, Memphis City Schools, and Pittsburgh Public Schools. Santa Monica, CA: RAND Corporation. Retrieved from https://www.rand.org/content/dam/rand/pubs/working_papers/WR900/WR989/RAND_WR989.pdf
Danielson, C. (1996, 2007). Enhancing professional practice: A framework for teaching (1st and 2nd eds.). Alexandria, VA: ASCD.
Danielson, C. (2010). Evaluations that help teachers learn. Educational Leadership, 68(4), 35–39. Retrieved from http://www.ascd.org/publications/educational-leadership/dec10/vol68/num04/Evaluations-That-Help-Teachers-Learn.aspx
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation: Popular modes of evaluating teachers are fraught with inaccuracies and inconsistencies, but the field has identified better approaches. Phi Delta Kappan, 93(6), 8–15. Retrieved from https://www.edweek.org/ew/articles/2012/03/01/kappan_hammond.html
Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from https://eric.ed.gov/?id=ED521228
Goe, L., Holdheide, L., & Miller, T. (2011). A practical guide to designing comprehensive teacher evaluation systems: A tool to assist in the development of teacher evaluation systems. Washington, DC: National Comprehensive Center for Teacher Quality. Retrieved from https://files.eric.ed.gov/fulltext/ED520828.pdf
Hunter, M. (1984). Knowing, teaching, and supervising. In P. Hosford (Ed.), Using what we know about teaching (pp. 169–192). Alexandria, VA: ASCD.
Johnson, E. G., & Zwick, R. (1990). Focusing the new design: The NAEP 1988 technical report. Journal of Educational and Behavioral Studies, 17, 95–109.
Kane, T. J., & Staiger, D. O. (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Seattle, WA: Bill and Melinda Gates Foundation.
Kane, T. J., Taylor, E. S., Tyler, J. H., & Wooten, A. L. (2011). Identifying effective classroom practices using achievement data. Journal of Human Resources, 46(3), 587–613.
Kettler, R. J., & Reddy, L. A. (2017). Using observational assessment to inform professional development decisions: Alternative scoring for the Danielson Framework for Teaching. Assessment for Effective Intervention, 1–12.
Lash, A., Tran, L., & Huang, M. (2016). Examining the validity of ratings from a classroom observation instrument for use in a district’s teacher evaluation system (REL 2016-135). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory West.
Marzano, R. J. (2012). Teacher Evaluation: What’s fair? What’s effective? The two purposes of teacher evaluation. Educational Leadership, 70(3), 14–19. Alexandria, VA: ASCD. Retrieved from http://www.ascd.org/publications/educational-leadership/nov12/vol70/num03/The-Two-Purposes-of-Teacher-Evaluation.aspx
Marzano, R., Frontier, T., & Livingston, D. (2011). Effective supervision: Supporting the art and science of teaching. Alexandria, VA: ASCD.
Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257.
Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.
Polikoff, M. S., & Porter, A. C. (2014). Instructional alignment as a measure of teacher quality. Educational Evaluation and Policy Analysis, 64(3), 212–225. Retrieved from http://www.aera.net/Newsroom/Recent-AERA-Research/Instructional-Alignment-as-a-Measure-of-Teaching-Quality
Sanders, W. L., Wright, S. P., & Horn, S. P. (1997). Teacher and classroom context effects on student achievement: Implications for teacher evaluation. Journal of Personnel Evaluation and Education, 11(1), 57–67.
Santiago, P., & Benavides, F. (2009). Teacher evaluation: A conceptual framework and examples of country practices. Organisation for Economic Cooperation and Development (OECD). Retrieved from http://www.oecd.org/education/school/44568106.pdf
Sawchuk, S. (2015, September 3). Teacher Evaluation: An issue overview. Education Week. Retrieved from www.edweek.org/ew/section/multimedia/teacher-performance-evaluation-issue-overview.html
Sayavedra, M. (2014). Teacher evaluation. ORTESOL Journal, 31, 1–9.
Stecher, B. M., Holtzman, D. J., Garet, M. S., Hamilton, L. S., Engberg, J., Steiner, E. D., … Chambers, J. (2018). Improving teaching effectiveness: Final report: The intensive partnerships for effective teaching through 2015–2016. Santa Monica, CA: RAND Corporation. Retrieved from https://www.rand.org/pubs/research_reports/RR2242.html
Steinberg, M. P., & Sartain, L. (2015). Does teacher evaluation improve school performance? Experimental evidence from Chicago’s Excellence in Teaching project. Education Finance and Policy, 10(4), 535–572.
Taylor, E. S., & Tyler, J. H. (2012a). Can teacher evaluation improve teaching? Evidence of systematic growth in the effectiveness of midcareer teachers. Education Next, 12(4). Retrieved from http://educationnext.org/can-teacher-evaluation-improve-teaching/
Taylor, E. S., & Tyler, J. H. (2012b). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628–3651.
Toch, T., & Rothman, R. (2008). Rush to judgment: Teacher evaluation in public education. Washington, DC: Education Sector. Retrieved from https://eric.ed.gov/?id=ED502120
Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. New York, NY: The New Teacher Project. Retrieved from https://tntp.org/publications/view/the-widget-effect-failure-to-act-on-differences-in-teacher-effectiveness
Wenglinsky, H. (2002). The link between teacher classroom practices and student academic performance. Education Policy Analysis Archives, 10(12).
Wise, A. E., Darling-Hammond, L., Tyson-Bernstein, H., & McLaughlin, M. W. (1984). Teacher evaluation: A study of effective practices. Santa Monica, CA: RAND Corporation. Retrieved from https://www.rand.org/pubs/reports/R3139.html
TITLE
SYNOPSIS
CITATION
LINK
Teachers’ subject matter knowledge as a teacher qualification: A synthesis of the quantitative literature on students’ mathematics achievement
The main focus of this study is to find different kinds of variables that might contribute to variations in the strength and direction of the relationship by examining quantitative studies that relate mathematics teachers’ subject matter knowledge to student achievement in mathematics.
Ahn, S., & Choi, J. (2004). Teachers' Subject Matter Knowledge as a Teacher Qualification: A Synthesis of the Quantitative Literature on Students' Mathematics Achievement. Online Submission.
Teachers' Subject Matter Knowledge as a Teacher Qualification: A Synthesis of the Quantitative Literature on Students' Mathematics Achievement
The aim of this paper is to examine a variety of features of research that might account for mixed findings of the relationship between teachers' subject matter knowledge and student achievement based on meta-analytic technique.
Ahn, S., & Choi, J. (2004). Teachers' Subject Matter Knowledge as a Teacher Qualification: A Synthesis of the Quantitative Literature on Students' Mathematics Achievement. Online Submission.
Pushing the horizons of student teacher supervision: Can a bug-in-ear system be an effective plug-and-play tool for a novice electronic coach to use in student teacher supervision? ProQuest Dissertations and Theses.
This case study explored the use of the Bug-in-Ear (BIE) tool for undergraduate student-teacher supervision in the hands of a novice BIE coach, including the ease with which BIE equipment can be set up and operated by a novice coach and naïve users in the classroom.
Almendarez, M. B., Zigmond, N., Hamilton, R., Lemons, C., Lyon, S., McKeown, M., Rock, M. (2012). Pushing the horizons of student teacher supervision: Can a bug-in-ear system be an effective plug-and-play tool for a novice electronic coach to use in student teacher supervision? ProQuest Dissertations and Theses.
Not Prepared for Class: High-Poverty Schools Continue to Have Fewer In-Field Teachers.
As Secretary of Education from 1993 to 2001, Richard Riley had serious concerns about out-of-field teaching. The practice, which places in core academic classes instructors who have neither certification nor a major in the subject field taught, just didn’t make sense to him.
Almy, S., & Theokas, C. (2010). Not Prepared for Class: High-Poverty Schools Continue to Have Fewer In-Field Teachers. Education Trust.
ASA statement on using value-added models for educational assessment
Value-Added Models (VAMs) have been embraced by many states and school districts as part of educational accountability systems. Value-Added Assessment (VAA) Models attempt to estimate effects of individual teachers or schools on student achievement while accounting for differences in student background. This paper provides a summary of the American Statistical Association’s analysis of the efficacy of value-added modeling in education.
American Statistical Association. (2014). ASA statement on using value-added models for educational assessment. Alexandria, VA.
Relationships among teachers and students’ thinking skills, sense of efficacy and student achievement
Examined relationships between and among teachers' and students' sense of efficacy, thinking skills, and student achievement. Teachers were interviewed at the beginning and end of the year. Relationships among student thinking, efficacy, and achievement were clearly demonstrated.
Anderson, R. N., Greene, M. L., & Loewen, P. S. (1988). Relationships among teachers' and students' thinking skills, sense of efficacy, and student achievement. Alberta Journal of Educational Research.
Teacher evaluations: What is the issue and why does it matter? Policy snapshot
A report by TNTP finds 99 percent of teachers are rated good or great, confirming related findings that evaluation systems are not meaningfully differentiating teachers or providing useful feedback. TNTP recommends states use student growth as one measure of teacher effectiveness.
Aragon, S. (2018). Teacher Evaluations: What Is the Issue and Why Does It Matter? Policy Snapshot. Education Commission of the States.
Evaluating the impact of performance-related pay for teachers in England.
This paper evaluates the impact of a performance-related pay scheme for teachers in England.
Atkinson, A., Burgess, S., Croxson, B., Gregg, P., Propper, C., Slater, H., & Wilson, D. (2009). Evaluating the impact of performance-related pay for teachers in England. Labour Economics, 16(3), 251-261.
Problems with the use of student test scores to evaluate teachers
There is also little or no evidence for the claim that teachers will be more motivated to improve student learning if teachers are evaluated or monetarily rewarded for student test score gains.
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., ... & Shepard, L. A. (2010). Problems with the Use of Student Test Scores to Evaluate Teachers. EPI Briefing Paper# 278. Economic Policy Institute.
The Gates Foundation bet big on teacher evaluation. The report it commissioned explains how those efforts fell short.
Bad teachers were the problem; good teachers were the solution. It was a simplified binary, but the idea and the research it drew on had spurred policy changes across the country, including a spate of laws establishing new evaluation systems designed to reward top teachers and help weed out low performers. Behind that effort was the Bill and Melinda Gates Foundation, which backed research and advocacy that ultimately shaped these changes.
Barnum, M. (2018, June 21). The Gates Foundation bet big on teacher evaluation. The report it commissioned explains how those efforts fell short. Chalkbeat. Retrieved from https://www.chalkbeat.org/posts/us/2018/06/21/the-gates-foundation-bet-big-on-teacher-evaluation-the-report-it-commissioned-explains-how-those-efforts-fell-short/
Enhancing Adherence to a Problem Solving Model for Middle-School Pre-Referral Teams: A Performance Feedback and Checklist Approach
This study looks at the use of performance feedback and checklists to improve middle-school teams’ problem solving.
Bartels, S. M., & Mortenson, B. P. (2006). Enhancing adherence to a problem-solving model for middle-school pre-referral teams: A performance feedback and checklist approach. Journal of Applied School Psychology, 22(1), 109-123.
Learning about teaching: Initial findings from the measures of effective teaching project
In fall 2009, the Bill & Melinda Gates Foundation launched the Measures of Effective Teaching (MET) project to test new approaches to measuring effective teaching. The goal of the MET project is to improve the quality of information about teaching effectiveness available to education professionals within states and districts.
Bill and Melinda Gates Foundation. (2010). Learning about teaching: Initial findings from the measures of effective teaching project. Retrieved from https://docs.gatesfoundation.org/documents/preliminary-findings-research-paper.pdf
Conceptions of beginning teacher quality: Models for conducting research
In this paper, we consider traditions of research on teaching and how conceptions of good teaching evolved as traditions changed.
Blanton, L. P., Sindelar, P. T., Correa, V., Harman, M., McDonnell, J., & Kuhel, K. (2003). Conceptions of beginning teacher quality: Models for conducting research (COPSSE Doc. No. RS-6). Gainesville, FL: Center on Personnel Studies in Special Education (COPSSE), University of Florida. Retrieved from http://copsse.education.ufl.edu//docs/RS-6/1/RS-6.pdf
Houston ties teachers’ pay to test scores.
Over the objection of the teachers' union, the Board of Education here on Thursday unanimously approved the nation's largest merit pay program, which calls for rewarding teachers based on how well their students perform on standardized tests.
Blumenthal, R. (2006). Houston ties teachers’ pay to test scores. New York Times, 13.
Improving schools by standardized tests
This book divides itself naturally into two parts. The first part has to do with the situation in which Superintendent Brooks found himself, with his successful campaign in educating his teachers to use standardized tests, with the results which he obtained, with the way he used these results to grade his pupils, to rate his teachers, and to evaluate methods of teaching, and finally with the use he made of intelligence tests.
Brooks, S. S. (1905). Improving schools by standardized tests. Houghton Mifflin.
Do Principals Know Good Teaching When They See It?
This article examines the effectiveness and related issues of current methods of principal evaluation of teachers.
Burns, M. (2011). Do Principals Know Good Teaching When They See It? Educational Policy, 19(1), 155-180.
How much are districts spending to implement teacher evaluation systems: Case studies of Hillsborough County Public Schools, Memphis City Schools, and Pittsburgh Public Schools.
This report presents case studies of the efforts by three school districts, Hillsborough County Public Schools (HCPS), Memphis City Schools (MCS), and Pittsburgh Public Schools (PPS), to launch, implement, and operate new teacher evaluation systems as part of a larger reform effort called the Partnership Sites to Empower Effective Teaching.
Chambers, J., Brodziak de los Reyes, I., & O'Neil, C. (2013). How Much are Districts Spending to Implement Teacher Evaluation Systems?
The Long-Term Impacts Of Teachers: Teacher Value-Added And Student Outcomes In Adulthood
This paper examines the issue of efficacy of value-added measures in evaluating teachers. This question is important in understanding whether value-added analysis provides unbiased estimates of teachers’ impact on student achievement and whether these teachers improve long-term student outcomes.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2011). The long-term impacts of teachers: Teacher value-added and student outcomes in adulthood (No. w17699). National Bureau of Economic Research.
School-wide benchmarks of quality (Revised). Unpublished instrument
The School-Wide Benchmarks of Quality (BoQ) was initially developed and validated in 2005 to address the need for an efficient method of measuring implementation of school-wide PBS that would also provide feedback to guide teams toward higher levels of implementation. Over the last 5 years, the exposure and use of the instrument have increased.
Childs, K. E., Kincaid, D., & George, H. P. (2011). The revised school-wide PBS Benchmarks of Quality (BoQ). OSEP Technical Assistance Center on Positive Behavioral Interventions and Supports.
Overview of Teacher Evaluation
This overview provides information about teacher evaluation as it relates to collecting information about teacher practice and using it to improve student outcomes. The history of teacher evaluation and current research findings and implications are included.
Cleaver, S., Detrich, R. & States, J. (2018). Overview of Teacher Evaluation. Oakland, CA: The Wing Institute. https://www.winginstitute.org/quality-teachers-evaluation.
Overview: Formal Teacher Evaluation
The purpose of this overview is to provide information about the role of formal teacher evaluation, the research that examines the practice, and its impact on student outcomes.
Performance Feedback Overview
This overview examines the current understanding of research on performance feedback as a way to improve teacher performance and student outcomes.
Cleaver, S., Detrich, R. & States, J. (2019). Overview of Performance Feedback. Oakland, CA: The Wing Institute. https://www.winginstitute.org/teacher-evaluation-feedback.
Effects of immediate performance feedback on implementation of behavior support plans, 2005
The purpose of this study is to examine the effects of feedback on treatment integrity for implementing behavior support plans.
Codding, R. S., Feinberg, A. B., Dunn, E. K., & Pace, G. M. (2005). Effects of immediate performance feedback on implementation of behavior support plans. Journal of Applied Behavior Analysis, 38(2), 205-219.
An Evaluation of Teachers Trained Through Different Routes to Certification, Final Report
The study compares the effectiveness of different routes to teaching. It finds there is no significant difference in the effectiveness of teachers who were traditionally trained when compared to teachers who obtained training through alternative credential programs.
Constantine, J., Player, D., Silva, T., Hallgren, K., Grider, M., & Deke, J. (2009). An Evaluation of Teachers Trained Through Different Routes to Certification, Final Report (NCEE 2009-4043). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Applied Behavior Analysis
This book is a comprehensive description of the principles and procedures for systematic change of socially significant behavior. It includes basic principles, applications, and behavioral research methods.
Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis.
Can teachers be evaluated by their students’ test scores? Should they be? The use of value-added measures for teacher effectiveness in policy and practice
In this report, the author aims to provide an accessible introduction to these new measures of teaching quality and put them into the broader context of concerns over school quality and achievement gaps.
Corcoran, S. P. (2010). Can Teachers Be Evaluated by Their Students' Test Scores? Should They Be? The Use of Value-Added Measures of Teacher Effectiveness in Policy and Practice. Education Policy for Action Series. Annenberg Institute for School Reform at Brown University (NJ1).
Performance Feedback in Education: On Who and For What
This paper reviews the importance of feedback in education and the scientific model of behavior change (antecedent, behavior, consequences).
Daniels, A. (2013). Feedback in Education: On Whom and for What. In Performance Feedback: Using Data to Improve Educator Performance (Vol. 3, pp. 77-95). Oakland, CA: The Wing Institute.
Enhancing Professional Practice: A Framework for Teaching
The framework for teaching is a research-based set of components of instruction that are grounded in a constructivist view of learning and teaching. The framework defines four levels of performance--Unsatisfactory, Basic, Proficient, and Distinguished--for each element, providing a valuable tool that all teachers can use.
Danielson, C. (2007). Enhancing professional practice: A framework for teaching. ASCD.
Evaluations that help teachers learn.
This article addresses the topics of staff assessment, teacher supervision, and professional development.
Danielson, C. (2011). Evaluations that help teachers learn. Educational leadership, 68(4), 35-39.
Evaluating teacher evaluation: Popular modes of evaluating teachers are fraught with inaccuracies and inconsistencies
Popular modes of evaluating teachers are fraught with inaccuracies and inconsistencies, but the field has identified better approaches. Value-added models enable researchers to use statistical methods to measure changes in student scores over time while considering student characteristics and other factors often found to influence achievement.
Darling-Hammond, L., Amrein-Beardsley, A., Haertel, E., & Rothstein, J. (2012). Evaluating teacher evaluation: Popular modes of evaluating teachers are fraught with inaccuracies and inconsistencies, but the field has identified better approaches. Phi Delta Kappan, 93(6), 8–15. Retrieved from https://www.edweek.org/ew/articles/2012/03/01/kappan_hammond.html
What research says about using value-added measures to evaluate teachers.
A growing number of researchers are studying whether value-added measures can do a good job of measuring the contribution of teachers to test score growth. Here I summarize a handful of analyses that shed light on two questions.
David, J. L. (2010). What research says about using value-added measures to evaluate teachers. Educational Leadership, 67(8), 81–82. Retrieved from http://www.ascd.org/publications/educational_leadership/may10/vol67/num08/Using_Value-Added_Measures_to_Evaluate_Teachers.aspx
Distributed Leadership
This review summarizes the evidence for the model’s efficacy in explaining how principals and teachers together influence school practices and effectiveness.
Donley, J., Detrich, R., States, J., & Keyworth, R. (2020). Distributed Leadership. Oakland, CA: The Wing Institute. https://www.winginstitute.org/leadership-models-distributed
How do principals really improve schools?
Principals are in a paradoxical position. On one hand, they're called on to use research-based strategies to improve student achievement. On the other, they're increasingly required to micromanage teachers by observing in classrooms and engaging in intensive evaluation. The authors point out that these two positions are at odds with each other.
Dufour, R., & Mattos, M. (2013). How Do Principals Really Improve Schools? Educational Leadership, 70(7), 34-40.
Leading for Instructional Improvement: How Successful Leaders Develop Teaching and Learning Expertise
This book shows how principals and other school leaders can develop the skills necessary for teachers to deliver high quality instruction by introducing principals to a five-part model of effective instruction.
Fink, S., & Markholt, A. (2011). Leading for instructional improvement: How successful leaders develop teaching and learning expertise. John Wiley & Sons.
Teacher Incentives and Student Achievement: Evidence from New York City Public Schools
This article describes a school-based randomized trial in over 200 New York City public schools designed to better understand the impact of teacher incentives.
Fryer, R. G. (2013). Teacher incentives and student achievement: Evidence from New York City public schools. Journal of Labor Economics, 31(2), 373-407.
Approaches to Evaluating Teacher Effectiveness: A Research Synthesis
This research synthesis examines how teacher effectiveness is currently measured (i.e., formative vs. summative evaluation).
Goe, L., Bell, C., & Little, O. (2008). Approaches to Evaluating Teacher Effectiveness: A Research Synthesis. National Comprehensive Center for Teacher Quality.
A Practical Guide to Designing Comprehensive Teacher Evaluation Systems: A Tool to Assist in the Development of Teacher Evaluation Systems
This guide is a tool designed to assist states and districts in constructing high-quality teacher evaluation systems in an effort to improve teaching and learning.
Goe, L., Holdheide, L., & Miller, T. (2011). A Practical Guide to Designing Comprehensive Teacher Evaluation Systems: A Tool to Assist in the Development of Teacher Evaluation Systems. National Comprehensive Center for Teacher Quality.
In school, teacher quality matters most. Education Next
FIFTY YEARS after the release of "Equality of Educational Opportunity"--widely known as the Coleman Report--much of what James Coleman and his colleagues reported holds up well to scrutiny. It is, in fact, remarkable to read through the 700-plus pages and see how little has changed about what the empirical evidence says matters. The report's conclusions about the importance of teacher quality, in particular, have stood the test of time, which is noteworthy, given that today's studies of the impacts of teachers use more-sophisticated statistical methods and employ far better data.
Goldhaber, D. (2016). In schools, teacher quality matters most: today's research reinforces Coleman's findings. Education Next, 16(2), 56-63.
Is this just a bad class? Assessing the stability of measured teacher performance
This paper reports on work estimating the stability of value-added estimates of teacher effects, an important area of investigation given that new workforce policies implicitly assume that effectiveness is a stable attribute within teachers.
Goldhaber, D. D., & Hansen, M. (2008). Is it Just a Bad Class?: Assessing the Stability of Measured Teacher Performance. Seattle, WA: Center on Reinventing Public Education.
Uneven playing field? Assessing the teacher quality gaps between advantaged and disadvantaged students.
This study presents a comprehensive, descriptive analysis of the inequitable distribution of both input and output measures of teacher quality across various indicators of student disadvantage across all school districts in Washington State.
Goldhaber, D., Lavery, L., & Theobald, R. (2015). Uneven playing field? Assessing the teacher quality gaps between advantaged and disadvantaged students. Educational Researcher, 44(5), 293–307.
Identifying effective teachers using performance on the job
This paper provides recommendations to increase the pool of potential teachers, make it tougher to award tenure to those who perform least well, and reward effective teachers who are willing to work in schools serving large numbers of low-income, disadvantaged children.
Gordon, R., Kane, T. J., & Staiger, D. O. (2006). Identifying Effective Teachers Using Performance on the Job. The Hamilton Project Policy Brief No. 2006-01. Brookings Institution.
Undue process: Why bad teachers in twenty-five diverse districts rarely get fired
Is dismissing poorly performing teachers truly feasible in America today? After all the political capital (and real capital) spent on reforming teacher evaluation, can districts actually terminate ineffective teachers who have tenure or have achieved veteran status?
Griffith, D., & McDougald, V. (2016). Undue process: Why bad teachers in twenty-five diverse districts rarely get fired. Washington DC: Thomas B. Fordham Institute. Retrieved from http://edex. s3-us-west-2. amazonaws. com/publication/pdfs, 2812, 29.
Principal effectiveness and principal turnover.
This study investigates the association between principal effectiveness and principal turnover using longitudinal data from Tennessee, a state that has invested in multiple measures of principal performance through its educator evaluation system.
Grissom, J. A., & Bartanen, B. (2019a). Principal effectiveness and principal turnover. Education Finance and Policy, 14(3), 355–382. Retrieved from https://www.mitpressjournals.org/doi/full/10.1162/edfp_a_00256
Effective Instructional Time Use for School Leaders: Longitudinal Evidence from Observations of Principals
This study examines principals’ time spent on instructional functions. The results show that the traditional walk-through has little impact, but principals who provide coaching and evaluation and who focus on educational programs can make a difference.
Grissom, J. A., Loeb, S., & Master, B. (2013). Effective Instructional Time Use for School Leaders: Longitudinal Evidence from Observations of Principals. Educational Researcher, 42(8), 433-444.
Supporting Principals in Implementing Teacher Evaluation Systems
With so much emphasis being placed on improving teacher performance, the National Association of Elementary School Principals and the National Association of Secondary School Principals have developed recommendations to help principals evaluate teachers more effectively.
Grissom, J. A., Loeb, S., & Master, B. (2013). Effective Instructional Time Use for School Leaders: Longitudinal Evidence from Observations of Principals. Educational Researcher, 42(8), 433-444.
Teacher characteristics and gains in student achievement: Estimation using micro-data.
The major objective of this data analysis was to estimate the relationship between variables which can be controlled by public policy and educational output.
Hanushek, E. A. (1971). Teacher characteristics and gains in student achievement: Estimation using micro data. American Economic Review, 61(2), 280-288.
Teacher Deselection.
This discussion provides a quantitative statement of one approach to achieving the governors’ (and the nation’s) goals – teacher deselection.
Hanushek, E. A. (2009). Teacher deselection. Creating a new teaching profession, 168, 172-173.
Skills, productivity, and the evaluation of teacher performance.
The authors examine the relationships between observational ratings of teacher performance, principals’ evaluations of teachers’ cognitive and non-cognitive skills and test-score based measures of teachers’ productivity.
Harris, D. N., & Sass, T. R. (2014). Skills, productivity and the evaluation of teacher performance. Economics of Education Review, 40, 183-204.
Visible learning
This influential book is the result of 15 years of research that includes over 800 meta-analyses on the influences on achievement in school-aged students. This is a great resource for any stakeholder interested in conducting a serious search of evidence behind common models and practices used in schools.
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.
Visible Learning for Teachers: Maximizing Impact on Learning
This book takes over fifteen years of rigorous research into education practices and provides teachers in training and in-service teachers with concise summaries of the most effective interventions and offers practical guidance to successful implementation in classrooms.
Hattie, J. (2012). Visible learning for teachers: Maximizing impact on learning. Routledge.
The Power of Feedback
This paper provides a conceptual analysis of feedback and reviews the evidence related to its impact on learning and achievement.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of educational research, 77(1), 81-112.
Teacher evaluation as a policy target for improved student learning: A fifty-state review of statute and regulatory action since NCLB
This paper reports on the analysis of state statutes and department of education regulations in fifty states for changes in teacher evaluation in use since the passage of No Child Left Behind Act of 2001.
Hazi, H. M., & Rucinski, D. A. (2009). Teacher evaluation as a policy target for improved student learning: A fifty-state review of statute and regulatory action since NCLB. Education Policy Analysis Archives, 17(5).
Impact of performance feedback delivered via electronic mail on preschool teachers’ use of descriptive praise.
This paper examined the effects of a professional development intervention that included data-based performance feedback delivered via electronic mail (e-mail) on preschool teachers’ use of descriptive praise and whether increased use of descriptive praise was associated with changes in classroom-wide measures of child engagement and challenging behavior.
Hemmeter, M. L., Snyder, P., Kinder, K., & Artman, K. (2011). Impact of performance feedback delivered via electronic mail on preschool teachers’ use of descriptive praise. Early Childhood Research Quarterly, 26(1), 96-109.
Learning from teacher observations: Challenges and opportunities posed by new teacher evaluation systems
This article discusses the current focus on using teacher observation instruments as part of new teacher evaluation systems being considered and implemented by states and districts.
Hill, H., & Grossman, P. (2013). Learning from teacher observations: Challenges and opportunities posed by new teacher evaluation systems. Harvard Educational Review, 83(2), 371-384.
Can Principals Identify Effective Teachers? Evidence on Subjective Performance Evaluation in Education
This paper examines how well principals can distinguish between more and less effective teachers. To put principal evaluations in context, we compare them with the traditional determinants of teacher compensation (education and experience) as well as value-added measures of teacher effectiveness.
Jacob, B. A., & Lefgren, L. (2008). Can principals identify effective teachers? Evidence on subjective performance evaluation in education. Journal of Labor Economics, 26(1), 101-136.
Teacher perspectives on evaluation reform: Chicago’s REACH students.
This study draws on 32 interviews from a random sample of teachers and 2 years of survey data from more than 12,000 teachers per year to measure their perceptions of the clarity, practicality, and cost of the new system.
Jiang, J. Y., Sporte, S. E., & Luppescu, S. (2015). Teacher perspectives on evaluation reform: Chicago’s REACH students. Educational Researcher, 44(2), 105-116.
Focusing the new design: The NAEP 1988 technical report
The 1988 NAEP surveyed American students' knowledge of reading, writing, civics, U.S. history, and geography.
Johnson, E. G., & Zwick, R. (1990). Focusing the new design: The NAEP 1988 technical report. Journal of Educational and Behavioral Studies, 17, 95–109.
Estimating teacher impacts on student achievement: An experimental evaluation
This study used a random-assignment experiment in Los Angeles Unified School District to evaluate various non-experimental methods for estimating teacher effects on student test scores. Having estimated teacher effects during a pre-experimental period, the authors used these estimates to predict student achievement following random assignment of teachers to classrooms.
Kane, T. J., & Staiger, D. O. (2008). Estimating teacher impacts on student achievement: An experimental evaluation (No. w14607). National Bureau of Economic Research.
Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains.
This report presents an in-depth discussion of the analytical methods and findings from the Measures of Effective Teaching (MET) project’s analysis of classroom observations. A nontechnical companion report describes implications for policymakers and practitioners.
Kane, T. J., & Staiger, D. O. (2012). Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains. Research Paper. MET Project. Bill & Melinda Gates Foundation.
Identifying effective classroom practices using student achievement data
This paper combines information from classroom-based observations and measures of teachers' ability to improve student achievement as a step toward addressing these challenges. The results point to the promise of teacher evaluation systems that would use information from both classroom observations and student test scores to identify effective teachers.
Kane, T. J., Taylor, E. S., Tyler, J. H., & Wooten, A. L. (2011). Identifying effective classroom practices using student achievement data. Journal of Human Resources, 46(3), 587-613.
Proceedings from the Wing Institute’s Fifth Annual Summit on Evidence-Based Education: Education at the Crossroads: The State of Teacher Preparation
This article shared information about the Wing Institute and demographics of the Summit participants. It introduced the Summit topic, sharing performance data on past efforts of school reform that focused on structural changes rather than teaching improvement. The conclusion is that the system has spent enormous resources with virtually no positive results. The focus needs to be on teaching improvement.
Keyworth, R., Detrich, R., & States, J. (2012). Introduction: Proceedings from the Wing Institute’s Fifth Annual Summit on Evidence-Based Education: Education at the Crossroads: The State of Teacher Preparation. In Education at the Crossroads: The State of Teacher Preparation (Vol. 2, pp. ix-xxx). Oakland, CA: The Wing
School-wide benchmarks of quality
Worksheet on the school-wide benchmarks of quality. Benchmarks include faculty commitment, effective procedures for dealing with discipline, data entry and analysis plan establishment, expectation and rule development, and so on.
Kincaid, D., Childs, K., & George, H. (2005). School-wide benchmarks of quality. Unpublished instrument, University of South Florida.
Toward effective supervision: An operant analysis and comparison of managers at work, 1986
This study finds that performance monitoring is the factor that separates good managers from ineffective managers.
Komaki, J. L. (1986). Toward effective supervision: An operant analysis and comparison of managers at work. Journal of Applied Psychology, 71(2), 270.
A measured approach: Value-added models are a promising improvement, but no one measure can evaluate teacher performance
The education policy community is abuzz with interest in value-added modeling as a way to estimate the effectiveness of schools and especially teachers. Value-added models provide useful information, but that information is error-prone and has a number of other important limitations.
Koretz, D. (2008). A measured approach. American Educator, 32(2), 18-39.
The impact of feedback frequency on learning and task performance: Challenging the “more is better” assumption.
This paper challenges the “more is better” assumption and proposes that frequent feedback can overwhelm an individual’s cognitive resource capacity, thus reducing task effort and producing an inverted-U relationship with learning and performance over time.
Lam, C. F., DeRue, D. S., Karam, E. P., & Hollenbeck, J. R. (2011). The impact of feedback frequency on learning and task performance: Challenging the “more is better” assumption. Organizational Behavior and Human Decision Processes, 116(2), 217-228.
Examining the validity of ratings from a classroom observation instrument for use in a district’s teacher evaluation system
The purpose of this study was to examine the validity of teacher evaluation scores that are derived from an observation tool, adapted from Danielson's Framework for Teaching, designed to assess 22 teaching components from four teaching domains.
Lash, A., Tran, L., & Huang, M. (2016). Examining the Validity of Ratings from a Classroom Observation Instrument for Use in a District's Teacher Evaluation System. REL 2016-135. Regional Educational Laboratory West.
A National View of Certification of School Principals: Current and Future Trends
This paper focuses on two questions: (a) What patterns in certification currently exist across the states? and (b) What might these current patterns indicate for the future of school principal certification?
LeTendre, B. G., & Roberts, B. (2005). A national view of certification of school principals: Current and future trends. In University Council for Educational Administration Convention, Nashville, TN. Retrieved October 15, 2007.
Los Angeles teacher ratings
About 11,500 Los Angeles Unified elementary school teachers and 470 elementary schools are included in The Times' updated database of "value-added" ratings.
Los Angeles Times. (2021). Los Angeles teacher ratings.
The two purposes of teacher evaluation
Over one year, the author asked more than 3,000 educators their opinions about these two basic purposes by presenting them with a scale that has five values.
Marzano, R. J. (2012). Teacher Evaluation: What’s fair? What’s effective? The two purposes of teacher evaluation. Educational Leadership, 70(3), 14–19. Alexandria, VA: ASCD.
Effective supervision: Supporting the art and science of teaching
The authors show school and district-level administrators how to set the priorities and support the practices that will help all teachers become expert teachers. Their five-part framework is based on what research tells us about how expertise develops.
Marzano, R. J., Frontier, T., & Livingston, D. (2011). Effective supervision: Supporting the art and science of teaching. ASCD.
School leadership that works: From research to results
Building on the analysis that was first reported in School Leadership That Works, the authors of Balanced Leadership identify the 21 responsibilities associated with effective leadership and show how they relate to three overarching responsibilities.
Marzano, R. J., Waters, T., & McNulty, B. A. (2001). School leadership that works: From research to results. ASCD.
Alternative student growth measures for teacher evaluation: Implementation experiences of early-adopting districts
This study examines implementation of alternative student growth measures in a sample of eight school districts that were early adopters of the measures. It builds on an earlier Regional Educational Laboratory Mid-Atlantic report that described the two types of alternative student growth measures (alternative assessment–based value-added models and student learning objectives) in the early-adopting districts.
McCullough, M., English, B., Angus, M. H., & Gill, B. (2015). Alternative student growth measures for teacher evaluation: Implementation experiences of early-adopting districts. Mathematica Policy Research.
What is the purpose of teacher evaluation today? A conversation between Bellwether and Fordham.
In December 2016, Bellwether Education Partners and The Thomas B. Fordham Institute independently released two reports centered on teacher evaluation and its consequences. Both reports offer a glimpse into ongoing challenges and opportunities with teacher evaluation reform, but they have very different analyses.
McDougald, V., Griffith, D., Pennington, K., & Mead, S. (2016). What is the purpose of teacher evaluation today? A conversation between Bellwether and Fordham. Retrieved from https://edexcellence.net/articles/what-is-the-purpose-of-teacher-evaluation-today-a-conversation-between-bellwether-and
Validity research on teacher evaluation systems based on the framework for teaching.
This paper summarizes validity evidence pertaining to several different implementations of the Framework. It is based primarily on reviewing the published and unpublished studies that have looked at the relationship between teacher evaluation ratings made using systems based on the Framework and value-added measures of teacher effectiveness.
Milanowski, A. T. (2011). Validity Research on Teacher Evaluation Systems Based on the Framework for Teaching. Online Submission.
Status of the American Public School Teacher, 2000-2001.
This report presents the results of the 2000-01 Status of the American Public School Teacher survey. This survey has been conducted every 5 years since 1956.
National Education Association. (2003). Status of the American public school teacher, 2000-2001. NEA Professional Library.
How large are teacher effects?
This research uses data from a four-year experiment in which teachers and students were randomly assigned to classes to estimate teacher effects on student achievement.
Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26(3), 237–257.
Meeting the highly qualified teachers challenge: The secretary’s annual report on teacher quality.
Under the 1998 reauthorization of Title II of the Higher Education Act, the secretary of education is required to issue annual reports to Congress on the state of teacher quality nationwide. "Meeting the Highly Qualified Teachers Challenge" is the inaugural report on this important issue.
Paige, R. (2002). Meeting the Highly Qualified Teachers Challenge: The Secretary's Annual Report on Teacher Quality. US Department of Education.
For good measure? Teacher evaluation policy in the ESSA era.
As states and districts consider potential changes to their teacher evaluation systems and policies, this paper seeks to inform those efforts by reviewing the evolution of the teacher evaluation policy movement over the last several years, identifying positive outcomes of new systems and negative consequences, and describing risks that should be considered.
Pennington, K., & Mead, S. (2016). For good measure? Teacher evaluation policy in the ESSA era. Washington, DC: Bellwether Education Partners. Retrieved from https://bellwethereducation.org/publication/good-measure-teacher-evaluation-policy-essa-era
Teacher evaluation: A comprehensive guide to new directions and practices
This handbook advocates a new approach to teacher evaluation as a cooperative effort undertaken by a group of professionals.
Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices. Corwin Press.
Drive: The surprising truth about what motivates us
In this provocative and persuasive new book, the author asserts that the secret to high performance and satisfaction (at work, at school, and at home) is the deeply human need to direct our own lives, to learn and create new things, and to do better by ourselves and our world.
Pink, D. H. (2011). Drive: The surprising truth about what motivates us. Penguin.
Instructional alignment as a measure of teacher quality.
This article is the first to explore the extent to which teachers’ instructional alignment is associated with their contributions to student learning and their effectiveness on new composite evaluation measures using data from the Bill & Melinda Gates Foundation’s Measures of Effective Teaching study.
Polikoff, M. S., & Porter, A. C. (2014). Instructional alignment as a measure of teacher quality. Educational Evaluation and Policy Analysis, 64(3), 212–225. Retrieved from http://www.aera.net/Newsroom/Recent-AERA-Research/Instructional-Alignment-as-a-Measure-of-Teaching-Quality
Teachers matter: Understanding teachers’ impact on student achievement
Research using student scores on standardized tests confirms the common perception that some teachers are more effective than others. It also reveals that being taught by an effective teacher has important consequences for student achievement. The best way to assess a teacher's effectiveness is to look at his or her on-the-job performance.
RAND Education. (2012). Teachers matter: Understanding teachers’ impact on student achievement. Santa Monica, CA: Author. Retrieved from https://www.rand.org/pubs/corporate_pubs/CP693z1-2012-09.html
The impact of individual teachers on student achievement: Evidence from panel data
In order to provide accurate estimates of how much teachers affect the achievement of their students, this study used panel data covering over a decade of elementary student test scores and teacher assignment in two contiguous New Jersey school districts.
Rockoff, J. E. (2004). The impact of individual teachers on student achievement: Evidence from panel data. American Economic Review, 94(2), 247-252.
Teacher quality in educational production: Tracking, decay, and student achievement.
The author develops falsification tests for three widely used VAM specifications, based on the idea that future teachers cannot influence students' past achievement.
Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement. The Quarterly Journal of Economics, 125(1), 175-214.
Teacher and classroom context effects on student achievement: Implications for teacher evaluation.
This study examined the relative magnitude of teacher effects on student achievement while simultaneously considering the influences of intraclassroom heterogeneity, student achievement level, and class size on academic growth.
Sanders, W. L., Wright, S. P., & Horn, S. P. (1997). Teacher and classroom context effects on student achievement: Implications for teacher evaluation. Journal of Personnel Evaluation in Education, 11(1), 57–67.
Teacher evaluation: A conceptual framework and examples of country practices.
This paper proposes a conceptual framework to analyze teacher evaluation. It elaborates on the main components of a comprehensive teacher evaluation model and explains the main aspects to be taken into account for designing a teacher evaluation model.
Santiago, P., & Benavides, F. (2009). Teacher evaluation: A conceptual framework and examples of country practices. Organisation for Economic Cooperation and Development (OECD). Retrieved from http://www.oecd.org/education/school/44568106.pdf
Teacher evaluation: An issue overview.
Teacher evaluations matter a lot—both to teachers and to those holding them accountable. But how can schools measure the performance of all teachers fairly? And what should they do with the results?
Sawchuk, S. (2015). Teacher evaluation: An issue overview. Education Week, 35(3), 1-6.
Teacher Evaluation
Teacher evaluation can be a very sensitive topic for teachers and program administrators alike. Evaluations need to be fair and relevant to both teachers and programs.
Sayavedra, M. (2014). Teacher evaluation. ORTESOL Journal, 31, 1–9.
Teacher evaluation: Guide to effective practice.
This book is organized around four dominant interrelated core issues: professional standards, a guide to applying the Joint Committee's Standards, ten alternative models for the evaluation of teacher performance, and an analysis of these selected models.
Shinkfield, A. J., & Stufflebeam, D. L. (2012). Teacher evaluation: Guide to effective practice (Vol. 41). Springer Science & Business Media.
Barriers to the Preparation of Highly Qualified Teachers in Reading. TQ Research & Policy Brief.
This paper pointed out three prominent points of impact in addressing the poor performance of America’s fourth-graders on national examinations of reading proficiency.
Smartt, S. M., & Reschly, D. J. (2007). Barriers to the Preparation of Highly Qualified Teachers in Reading. TQ Research & Policy Brief. National Comprehensive Center for Teacher Quality.
Teacher pay for performance: Experimental evidence from the project on incentives in teaching
This paper presents the results of a rigorous experiment examining the impact of pay for performance on student achievement and instructional practice.
Springer, M. G., Ballou, D., Hamilton, L., Le, V. N., Lockwood, J. R., McCaffrey, D. F., ... & Stecher, B. M. (2011). Teacher Pay for Performance: Experimental Evidence from the Project on Incentives in Teaching (POINT). Society for Research on Educational Effectiveness.
Effective Teachers Make a Difference
This analysis examines the available research on effective teaching, how to impart these skills, and how to best transition teachers from pre-service to classroom with an emphasis on improving student achievement. It reviews current preparation practices and examines the research evidence on how well they are preparing teachers.
States, J., Detrich, R., & Keyworth, R. (2012). Effective Teachers Make a Difference. In Education at the Crossroads: The State of Teacher Preparation (Vol. 2, pp. 1-46). Oakland, CA: The Wing Institute.
Improving teaching effectiveness: Final report: The intensive partnerships for effective teaching through 2015–2016
The Bill & Melinda Gates Foundation launched the Intensive Partnerships for Effective Teaching initiative. The initiative's goal is dramatic gains in student achievement, graduation rates, and college-going, especially for low-income minority (LIM) students.
Stecher, B. M., Garet, M. S., Hamilton, L. S., Steiner, E. D., Robyn, A., Poirier, J., ... & de los Reyes, I. B. (2016). Improving Teaching Effectiveness: Implementation: The Intensive Partnerships for Effective Teaching Through 2013–2014. Rand Corporation.
Incorporating student performance measures into teacher evaluation systems.
The authors examine how the five profiled systems are addressing assessment quality, evaluating teachers in nontested subjects and grades, and assigning teachers responsibility for particular students. The authors also examine what is and is not known about the quality of various student performance measures used by school systems.
Steele, J. L., Hamilton, L. S., & Stecher, B. M. (2010). Incorporating Student Performance Measures into Teacher Evaluation Systems. Technical Report. Rand Corporation.
Does teacher evaluation improve school performance? Experimental evidence from Chicago’s Excellence in Teaching project
Chicago Public Schools initiated the Excellence in Teaching Project, a teacher evaluation program designed to increase student learning by improving classroom instruction through structured principal–teacher dialogue.
Steinberg, M. P., & Sartain, L. (2015). Does teacher evaluation improve school performance? Experimental evidence from Chicago’s Excellence in Teaching project. Education Finance and Policy, 10(4), 535–572.
The effect of evaluation on teacher performance.
This paper offers evidence that evaluation can shift the teacher effectiveness distribution through a different mechanism: by improving teacher skill, effort, or both in ways that persist in the long run.
Taylor, E. S., & Tyler, J. H. (2012). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628-51.
Can teacher evaluation improve teaching? Evidence of systematic growth in the effectiveness of mid-career teachers.
In the research reported here, the authors study one approach to teacher evaluation: practice-based assessment that relies on multiple, highly structured classroom observations conducted by experienced peer teachers and administrators.
Taylor, E. S., & Tyler, J. H. (2012a). Can teacher evaluation improve teaching? Evidence of systematic growth in the effectiveness of mid-career teachers. Education Next, 12(4), 79–84. Retrieved from http://educationnext.org/can-teacher-evaluation-improve-teaching/
Teacher Evaluation 2.0.
This report proposes six design standards that any rigorous and fair teacher evaluation system should meet. It offers a blueprint for better evaluations that can help every teacher succeed in the classroom—and give every student the best chance at success.
The New Teacher Project. (2010). Teacher Evaluation 2.0. New York, NY: Author. Retrieved from https://tntp.org/assets/documents/Teacher-Evaluation-Oct10F.pdf
Rush to judgment: Teacher evaluation in public education
The authors examine the causes and consequences of the status of teacher evaluation and its implications for the current national debate about performance pay for teachers. The report also examines a number of national, state, and local evaluation systems that offer potential alternatives to current practice.
Toch, T., & Rothman, R. (2008). Rush to Judgment: Teacher Evaluation in Public Education. Education Sector Reports. Education Sector.
The Widget Effect: Our National Failure to Acknowledge and Act on Differences in Teacher Effectiveness.
This report examines the pervasive and longstanding failure to recognize and respond to variations in the effectiveness of teachers.
Weisberg, D., Sexton, S., Mulhern, J., Keeling, D., Schunck, J., Palcisco, A., & Morgan, K. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. New Teacher Project.
How schools matter: The link between teacher classroom practices and student academic performance
Quantitative studies of school effects have generally supported the notion that the problems of U.S. education lie outside of the school. Yet such studies neglect the primary venue through which students learn, the classroom. The current study explores the link between classroom practices and student academic performance by applying multilevel modeling to the 1996 National Assessment of Educational Progress in mathematics. The study finds that the effects of classroom practices, when added to those of other teacher characteristics, are comparable in size to those of student background, suggesting that teachers can contribute as much to student learning as the students themselves.
Wenglinsky, H. (2002). How schools matter: The link between teacher classroom practices and student academic performance. Education Policy Analysis Archives, 10(12).
Teacher Evaluation: A Study of Effective Practices
A preliminary survey of 32 school districts identified as having highly developed teacher evaluation systems was followed by the selection of 4 case study districts.
Reviewing the Evidence on How Teacher Professional Development Affects Student Achievement. Issues & Answers.
The purpose of this study is to examine research to answer the question: What is the impact of teacher professional development on student achievement?
Yoon, K. S., Duncan, T., Lee, S. W. Y., Scarloss, B., & Shapley, K. L. (2007). Reviewing the Evidence on How Teacher Professional Development Affects Student Achievement. Issues & Answers. REL 2007-No. 033. Regional Educational Laboratory Southwest (NJ1).
Measuring What Matters: A Stronger Accountability Model for Teacher Education
This report proposes an accountability system to regulate teacher preparation programs in essential areas: student learning, classroom teaching skills, graduates' commitment to the profession, feedback from graduates and employers, and tests of teacher knowledge and skills.
Crowe, E. (2010). Measuring What Matters: A Stronger Accountability Model for Teacher Education. Online Submission.
2011 State Teacher Policy Yearbook: National Summary
This is a national analysis of each state’s performance against and progress toward a set of 36 specific, research-based teacher policy goals aimed at helping states build a comprehensive policy of teacher effectiveness.
Jacobs, S., Brody, S., Doherty, K., & Michele, K. (2011). 2011 State Teacher Policy Yearbook: National Summary. National Council on Teacher Quality.
Beyond effective supervision: Identifying key interactions between superior and subordinate
This paper examines the effects of performance monitoring by supervisors.
Komaki, J. L., & Citera, M. (1990). Beyond effective supervision: Identifying key interactions between superior and subordinate. The Leadership Quarterly, 1(2), 91-105.
Development of an operant-based taxonomy and observational index of supervisory behavior, 1986
This paper provides a taxonomy and observational instrument for seven categories of supervisory behavior.
Komaki, J. L., Zlotnick, S., & Jensen, M. (1986). Development of an operant-based taxonomy and observational index of supervisory behavior. Journal of Applied Psychology, 71(2), 260.
Implementing Data-Informed Decision Making in Schools: Teacher Access, Supports and Use
This paper documents education data systems and data-informed decision making in districts and schools. It examines implementation and the practices involving the use of data to improve instruction.
Means, B., Padilla, C., DeBarger, A., & Bakia, M. (2009). Implementing Data-Informed Decision Making in Schools: Teacher Access, Supports and Use. US Department of Education.