Education Drivers

Standardized Tests

A standardized achievement test is administered and scored using equivalent tests under the same conditions, permitting equal time for completion, and using the same scoring criteria. By controlling for significant variables such as testing conditions, the results can be compared across schools, districts, or states. There are two primary types of standardized achievement tests: norm-referenced and criterion-referenced. Norm-referenced tests rank individual student achievement compared with a statistically representative sample of peers, allowing educators to discriminate between high and low achievers. Criterion-referenced tests measure student performance against a fixed set of predetermined criteria or standards. Because criterion-referenced tests evaluate whether students have acquired a specific body of knowledge or skills, scores are frequently used as indicators of the effectiveness of instruction. When results are aggregated, criterion-referenced tests can help decision makers determine what is working and where adjustments can be made to improve instruction from one year to the next. Standardized achievement tests are referred to as high-stakes tests when consequences are attached to the results. Consequences range from affirming student promotion to assigning teacher compensation to determining funding for schools. Despite high expectations for these tests, research on high-stakes test accountability and achievement finds only a small effect size for improving student achievement.
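The difference between the two test types can be illustrated with a minimal sketch. The norm sample, cut score, and function names below are hypothetical values chosen for illustration and are not drawn from any actual test; the percentile-rank formula (counting ties as half) is one common convention among several.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_sample):
    """Norm-referenced view: where a score falls relative to a peer sample.

    Returns the percentage of the sample scoring below `score`,
    counting tied scores as half (one common convention).
    """
    ordered = sorted(norm_sample)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def meets_standard(score, cut_score):
    """Criterion-referenced view: compare a score with a fixed standard."""
    return score >= cut_score


# Hypothetical norm sample and proficiency cut score (illustrative only).
norm_sample = [52, 61, 67, 70, 74, 78, 81, 85, 90, 96]
CUT_SCORE = 80

student_score = 78
print(percentile_rank(student_score, norm_sample))
print(meets_standard(student_score, CUT_SCORE))
```

In this made-up example, the same score sits above the middle of the peer sample yet falls short of the fixed standard, which is why the two test types can lead to different conclusions about the same student.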

Publications

TITLE
SYNOPSIS
CITATION
Seeking the Magic Metric: Using Evidence to Identify and Track School System Progress

This paper discusses the search for a “magic metric” in education: an index/number that would be generally accepted as the most efficient descriptor of a school’s performance in a district.

Celio, M. B. (2013). Seeking the Magic Metric: Using Evidence to Identify and Track School System Quality. In Performance Feedback: Using Data to Improve Educator Performance (Vol. 3, pp. 97-118). Oakland, CA: The Wing Institute.

 

Data Mining

TITLE
SYNOPSIS
CITATION
Would a student rated 'Proficient' in Reading in one state be rated 'Proficient' in Reading in another state?
The inquiry compares student performance under state proficiency standards with performance under the National Assessment of Educational Progress (NAEP) proficiency standards.
Gibson, S. (2009). Would a student rated 'Proficient' in Reading in one state be rated 'Proficient' in Reading in another state? Retrieved from would-student-rated-'proficient.
What are the costs and benefits of five common educational interventions?
This analysis examined research by Stuart Yeh on the cost-effectiveness of common structural interventions in education.
States, J. (2010). What are the costs and benefits of five common educational interventions? Retrieved from what-are-costs-and.

 

Presentations

TITLE
SYNOPSIS
CITATION
Seeking the Magic Metric: Using Evidence to Identify and Track School System Progress

This paper discusses the search for a “magic metric” in education: an index/number that would be generally accepted as the most efficient descriptor of a school’s performance in a district.

Celio, M. B. (2011). Seeking the Magic Metric: Using Evidence to Identify and Track School System Progress [PowerPoint slides]. Retrieved from 2011-wing-presentation-mary-beth-celio.

TITLE
SYNOPSIS
CITATION
High-Stakes Testing, Uncertainty, and Student Learning

This study evaluated the relationship between scores on high-stakes tests and scores on other measures of learning such as NAEP and SAT scores. In general, there was no increase in student learning as a function of high-stakes testing.

Amrein, A. L., & Berliner, D. C. (2002). High-Stakes Testing, Uncertainty, and Student Learning. Education Policy Analysis Archives.

The Impact of High-Stakes Tests on Student Academic Performance

The purpose of this study is to assess whether academic achievement in fact increases after the introduction of high-stakes tests. The first objective of this study is to assess whether academic achievement has improved since the introduction of high-stakes testing policies in the 27 states with the highest stakes written into their grade 1-8 testing policies.

Amrein-Beardsley, A., & Berliner, D. C. (2002). The Impact of High-Stakes Tests on Student Academic Performance.

High-stakes testing, uncertainty, and student learning

A brief history of high-stakes testing is followed by an analysis of eighteen states with severe consequences attached to their testing programs.

Beardsley, A., & Berliner, D. C. (2002). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10.

Reconsidering the impact of high-stakes testing.

This article is an extended reanalysis of the effect of high-stakes testing on achievement. The paper focuses on the performance of states, over the period 1992 to 2000, on the NAEP mathematics assessments for grades 4 and 8.

Braun, H. (2004). Reconsidering the impact of high-stakes testing. Education Policy Analysis Archives, 12(1).

Does external accountability affect student outcomes? A cross-state analysis.

This study developed a zero-to-five index of the strength of accountability in 50 states based on the use of high-stakes testing to sanction and reward schools, and analyzed whether that index is related to student gains on the NAEP mathematics test in 1996–2000.

Carnoy, M., & Loeb, S. (2002). Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis, 24(4), 305-331.

BURIED TREASURE: Developing a Management Guide From Mountains of School Data

This report provides a practical “management guide” for an evidence-based key-indicator data decision system for school districts and schools.

Celio, M. B., & Harvey, J. (2005). Buried Treasure: Developing A Management Guide From Mountains of School Data. Center on Reinventing Public Education.

Validity of High-School Grades in Predicting Student Success beyond the Freshman Year: High-School Record vs. Standardized Tests as Indicators of Four-Year College Outcomes

High-school grades are often viewed as an unreliable criterion for college admissions, owing to differences in grading standards across high schools, while standardized tests are seen as methodologically rigorous, providing a more uniform and valid yardstick for assessing student ability and achievement. The present study challenges that conventional view. The study finds that high-school grade point average (HSGPA) is consistently the best predictor not only of freshman grades in college, the outcome indicator most often employed in predictive-validity studies, but of four-year college outcomes as well.

Geiser, S., & Santelices, M. V. (2007). Validity of High-School Grades in Predicting Student Success beyond the Freshman Year: High-School Record vs. Standardized Tests as Indicators of Four-Year College Outcomes. Research & Occasional Paper Series: CSHE. 6.07. Center for Studies in Higher Education.

Testing High Stakes Tests: Can We Believe the Results of Accountability Tests?

This study examines whether the results of standardized tests are distorted when rewards and sanctions are attached to them.

Greene, J., Winters, M., & Forster, G. (2004). Testing high-stakes tests: Can we believe the results of accountability tests? The Teachers College Record, 106(6), 1124-1144.

High stakes: Testing for tracking, promotion, and graduation

The report considers the appropriate uses and misuses of high-stakes tests in making decisions for students. The fundamental question is whether test scores lead to consequences that are educationally beneficial.

Heubert, J. P., & Hauser, R. M. (1998). High stakes: Testing for tracking, promotion, and graduation. Retrieved from http://files.eric.ed.gov/fulltext/ED439151.pdf

The effectiveness of the SAT in predicting success early and late in college: A meta-analysis

This meta-analysis examines the reliability and validity of SAT scores and student grades as predictors of student performance in college.

Hezlett, S., Kuncel, N., Vey, A., Ones, D., Campbell, J., & Camara, W. (2001). “The effectiveness of the SAT in predicting success early and late in college: A comprehensive meta-analysis.” Paper presented at the annual meeting of the National Council of Measurement in Education, Seattle, WA.

Accountability, incentives and behavior: The impact of high-stakes testing in the Chicago Public Schools

This study evaluated the effects of high-stakes testing on the achievement levels of students in Chicago Public Schools. The data suggest that even though scores went up on the high-stakes tests, scores on “low-stakes” achievement tests did not improve. This suggests that the increases in scores were a function of increases in test-specific skills rather than a general improvement in student learning. These findings give credence to the “teaching to the test” criticisms.

Jacob, B. A. (2005). Accountability, incentives and behavior: The impact of high-stakes testing in the Chicago Public Schools. Journal of Public Economics. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.401.6599&rep=rep1&type=pdf

The Effects of High-Stakes Testing on Achievement: Preliminary Findings about Generalization across Tests.

This study evaluated the generalization from high-stakes tests to other measures of achievement. The results show little generalization, suggesting that improvements in high-stakes test scores are the result of the emphasis placed on the tests and the time spent in test preparation rather than an actual increase in student learning.

Koretz, D. M. (1991). The Effects of High-Stakes Testing on Achievement: Preliminary Findings about Generalization across Tests. ERIC. Retrieved from http://files.eric.ed.gov/fulltext/ED340730.pdf

The Adverse Impact of High Stakes Testing on Minority Students: Evidence from 100 Years of Test Data.

This paper examines the impact of high-stakes testing on minority students. The outcomes suggest that high-stakes testing does not have a positive impact on minority students and that in some instances it has negative effects.

Madaus, G. F., & Clarke, M. (2001). The Adverse Impact of High Stakes Testing on Minority Students: Evidence from 100 Years of Test Data. ERIC. Retrieved from http://files.eric.ed.gov/fulltext/ED450183.pdf

Is Performance on the SAT Related to College Retention?

This study examines the relationship between scores on the SAT and retention to the second year of college using student-level data from the freshman class of 2006 at 106 four-year institutions.

Mattern, K. D., & Patterson, B. F. (2009). Is performance on the SAT related to college retention?

A guide to standardized testing: The nature of assessment

The goal of this guide is to provide useful information about standardized testing, or assessment, for practitioners and non-practitioners who care about public schools. It includes the nature of assessment, types of assessments and tests, and definitions.

Mitchell, R. (2006). A guide to standardized testing: The nature of assessment. Center for Public Education. 

Analysis of the predictive validity of the SAT and high school grades from 1976 to 1983

This study examines validity data for SAT scores and student grades for the enrolling classes of 1976 to 1985.

Morgan, R. (1989). “Analysis of the predictive validity of the SAT and high school grades from 1976 to 1983.” College Board Report No. 89-7. New York: College Board.

High Stakes: Testing for Tracking, Promotion, and Graduation

This book looks at how testing affects critical decisions for American students. The text focuses on how testing is used in schools to make decisions about tracking and placement, promotion and retention, and awarding or withholding high school diplomas. This book examines the controversies that emerge when a test score can open or close gates on a student's educational pathway.

National Research Council. (1999). High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: National Academies Press.

The Inevitable Corruption of Indicators and Educators through High-Stakes Testing.

The paper examines Campbell’s law: “the more any quantitative social indicator is used for social decision making the more likely the measure will corrupt the social processes it is intended to monitor.” In education, high-stakes testing has resulted in widespread cheating, exclusion of low-performing students from testing, encouragement of students to drop out, and narrowing of the curriculum.

Nichols, S. L., & Berliner, D. C. (2005). The Inevitable Corruption of Indicators and Educators through High-Stakes Testing. Education Policy Research Unit. Retrieved from http://files.eric.ed.gov/fulltext/ED508483.pdf

Can high stakes testing leverage educational improvement? Prospects from the last decade of testing and accountability reform.

This paper examines the use of high-stakes testing, such as end-of-course exams, in American education. It concludes that the exams do not produce substantive changes in instructional practices, and that the information they yield is useful for measuring school and system progress but has limited utility for instructional guidance.

Supovitz, J. (2009). Can high stakes testing leverage educational improvement? Prospects from the last decade of testing and accountability reform. Journal of Educational Change, 10(2-3), 211-227.

Comparing Alternatives in the Prediction of College Success

This study investigates the prediction of college success as defined by a student’s college GPA. We predict college GPA mid-way through and at the end of students’ college careers using high school GPA (HSGPA), college entrance exam scores (SAT/ACT) and an open-ended, performance-based assessment of critical thinking and writing skills (CLA). 3,137 college sophomores and 1,330 college seniors participated in this study.

Zahner, D., Ramsaran, L. M., & Steedle, J. T. (2012). Comparing alternatives in the prediction of college success. In Annual Meeting of the American Educational Research Association, Vancouver, Canada.
