Have mathematics scores improved over 40-plus years of school reform?

Why is this question important? In 1981, the United States formed the National Commission on Excellence in Education to assess the performance of the nation’s K–12 education system. The resulting report, A Nation at Risk (Gardner et al., 1983), described a system that was failing on virtually all levels: rapidly declining test scores, an “incoherent, outdated patchwork quilt” of classroom learning, and poorly trained teachers, among other problems. It catapulted education onto the national agenda, triggering a series of initiatives that invested enormous resources in school reform over the next 30 years (e.g., America 2000, No Child Left Behind, Race to the Top). The critical question is: Have we made any progress?

An essential element of school reform is having meaningful and accurate feedback data for evaluating education performance over time. Otherwise, it is impossible to make sound decisions about school reform initiatives. One such performance indicator is the National Assessment of Educational Progress (NAEP) long-term trend assessment, which provides the most extensive retrospective picture of student achievement in the United States. This assessment has been testing students in reading since 1971, and mathematics since 1973. It provides a benchmark for long-term evaluation of our progress in school reform.

See further discussion below.

NAEP mathematics scores, long-term trend assessment, 1973–2012. (National Center for Education Statistics [NCES], 2011f). *Test formats were changed in 2004, and both old and new formats were reported for that year. The new format was used in 2008 and 2012.

Results: Scale scores provide a numeric summary of what groups of students know and can do in a particular subject. NAEP mathematics scale scores range from 0 to 500. The data on student mathematics performance from 1973 through 2012 show a remarkable lack of student progress, despite numerous and significant school reform initiatives (A Nation at Risk, Goals 2000, No Child Left Behind). Students at ages 9 and 13 showed small improvements over this period (25 and 19 scale-score points, respectively), and 17-year-olds made virtually no improvement. Additionally, since 2008, only 13-year-olds have made any improvement; 9-year-olds and 17-year-olds made no progress.
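To put the reported gains in context, they can be expressed as a share of the 0 to 500 scale. The sketch below is illustrative only: it uses just the two gains and the scale range stated above, and the function name and the choice of "share of the full scale range" as a yardstick are assumptions made for this example, not an NCES metric. It also makes clear that the gains are scale-score points, not percentage points.

```python
# Illustrative sketch: express the reported gains as a share of the NAEP scale.
# Inputs come from the text above; the helper name is hypothetical.

SCALE_MAX = 500  # NAEP mathematics scale scores range from 0 to 500

# Gains in scale-score points over the long-term trend period, as reported above.
gains = {"age 9": 25, "age 13": 19}


def gain_as_share_of_scale(points, scale_max=SCALE_MAX):
    """Return a gain in scale-score points as a percentage of the full scale range."""
    return 100 * points / scale_max


for group, points in gains.items():
    share = gain_as_share_of_scale(points)
    print(f"{group}: +{points} points = {share:.1f}% of the scale range")
```

Because observed scores occupy only part of the 0 to 500 range, this share understates how large a gain is relative to typical score differences; it is a rough yardstick, not an effect size.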

Implications: The value of a nation’s education system is measured by how well it serves all of its children, not just those fortunate enough to attend a model school or live in a high-performing school district. Although the NAEP long-term trend assessment is a macro-measure that aggregates the performance of all students by age group, its data provide a clear picture of how poorly the U.S. education system is educating students on selected content measures. The implications of these data must therefore also be discussed at the macro level. Taken together, the resources, initiatives, policies, and programs the United States has devoted to K–12 school reform have not produced a corresponding improvement in student performance in mathematics. While 9-year-old and 13-year-old students demonstrated slight improvement over this period, 17-year-old students made virtually no improvement. This is alarming, because the 17-year-olds are the students closest to graduating.

Study Description: NAEP has often been called the gold standard for standardized academic testing because of the constant, rigorous scrutiny it receives (Gorman, 2010). Established in 1964, with the first tests administered in 1969, NAEP provides a continuing assessment of what American students know and can do in math, reading, science, writing, the arts, civics, economics, geography, and U.S. history. NAEP is administered by the National Center for Education Statistics (NCES), a division of the Institute of Education Sciences in the U.S. Department of Education. Panels of technical experts within NCES and other organizations continually scrutinize the tests for reliability and validity, keeping them similar from year to year and documenting any changes. NAEP is one of the few common metrics for all states, providing a picture of student academic progress over time.

The richest set of student achievement data comes from NAEP, which reports subject matter achievement in two ways: scale scores (long-term trend assessment) and achievement levels (main NAEP assessment). The long-term trend assessment makes available test data in reading and mathematics going back to the early 1970s (reading since 1971, mathematics since 1973), with test scores by age (9, 13, and 17). It is administered every 4 years. The main NAEP assessment reports test results on 12 different subject areas going back to 1992, with student data by grade (4, 8, and 12). It is administered every 2 years. For both assessments, probability samples of schools and students are selected to represent the diverse student population in the United States.

Citation:
Gardner, D. P., Larsen, Y. W., Baker, W. O., Campbell, A., Crosby, E. A., Foster, C. A., Jr., ... Wallace, R. (1983). A nation at risk: The imperative for educational reform. An open letter to the American people. A report to the nation and the secretary of education. Retrieved from http://www.eric.ed.gov/ERICWebPortal/detail?accno=ED226006

Gorman, S. (2010). An introduction to NAEP (NCES 2010-468). Washington, DC: National Center for Education Statistics. Retrieved from http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2010468

National Center for Education Statistics (NCES). (2011). Data Explorer for long-term trend [Data file]. Retrieved from http://nces.ed.gov/nationsreportcard/lttdata/

National Center for Education Statistics (NCES). (2013). The nation’s report card: Trends in academic progress 2012 (NCES 2013-456). Washington, DC: Institute of Education Sciences, U.S. Department of Education. Retrieved from http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2013456