“Educators today face a dilemma. Should they support current presidential, legislative, and corporate initiatives that claim to ensure a quality education for all children through the escalation of standardized measurement of predetermined learning outcomes? Should they accommodate standardized testing within a contemporary learner-centered paradigm, which endorses a more eclectic ‘toolbox’ approach to assessment that allows the informed educator to select among diverse gauges of learning progress?” (Gallagher, 2003). The answer remains to be seen…
One of the most controversial aspects of American education in the 21st century is the widespread use of standardized tests within our public schools. Teachers and parents alike worry about the high-stakes nature of these assessments, which are increasingly used to sort students and evaluate teachers worldwide. Many educators disagree with local testing policies, and others debate whether these exams are valid measures of student learning. Even understanding the results can be confusing: U.S. Secretary of Education Betsy DeVos made national headlines in 2017 when she fumbled through the difference between proficiency and growth during her confirmation hearings.
Long before No Child Left Behind, Race to the Top, and the Every Student Succeeds Act entered our daily lexicon, the American Psychological Association was asking powerful questions about the use of standardized testing in our schools. Specifically, the APA wanted to know why “American schools continue to rely on group-administered, standardized test scores for educational decision-making purposes” and how this “powerful historical tradition [became] a foundation for educational practices” (American Psychological Association, 1993).
Let us consider that last question: How did this powerful historical tradition become a foundation for educational practices?
Professors of education have long used international benchmarks, such as the PISA, to compare countries, often with mixed results. As previously noted, many countries outperform the United States on these standardized metrics, most notably in Scandinavia and Asia. Recently, comparative education researchers have looked to these countries, especially those in East Asia, to better understand why they are so successful.
Many Americans believe that Eastern countries attach a high cultural value to education. In reality, this belief is “an illusion at best and a cruel glorification of authoritarianism at worst” (Zhao, 2014). Zhao notes that this perceived culture is actually “a survival strategy the Chinese people developed to cope with thousands of years of authoritarian rule that has been glorified as China’s secret to educational success” (Zhao, 2014). It is important to know the rich history behind this phenomenon, which leads back centuries to Confucius’s time, when the majority of Asia was under imperial rule:
“The first examinations were attributed to the Sui emperors (589-618 A. D.) in China. With its flexible writing system and extensive body of recorded knowledge, China was in a position much earlier than the West to develop written examinations. The examinations were built around candidates’ ability to memorize, comprehend, and interpret classical texts. Aspirants prepared for the examinations on their own in private schools run by scholars or through private tutorials. Some took examinations as early as age 15, while others continued their studies into their thirties. After passing a regional examination, successful applicants traveled to the capital city to take a 3-day examination, with answers evaluated by a special examining board appointed by the Emperor. Each time the examination was offered, a fixed number of aspirants were accepted into the imperial bureaucracy” (U.S. Congress, 1992).
This imperial exam system, known in Mandarin as 科舉 or keju, was originally viewed as an equitable way to ensure that all students had a chance to rise up from their current caste. From the perspective of those in power, “keju was a tool to identify and recruit the most capable and virtuous individuals into government instead of relying on members of the hereditary noble class” (Zhao, 2014). Perhaps most notably, and because of its perceived fairness, objectivity, and openness, “keju gave birth to the idea of meritocracy, a core value in many eastern countries” (Zhao, 2014).
Even hundreds of years later, Sun Yat-sen, the founding father of the Republic of China, often praised keju as the underpinning of the world’s best education system. Dr. Yong Zhao often refers to a fable that Sun told about the drawbacks of a society without standardized tests. In the story, Sun described an election in the West between a doctor and a truck driver. “Of the two candidates, the doctor is certainly more knowledgeable than the driver, but he lost. This is the consequence of popular election without examination” (Zhao, 2014). When Sun Yat-sen set up a new government after overthrowing the Qing dynasty (the last imperial dynasty of China), the new constitution included an entire branch of government focused on standardized testing; this Examination Yuan continues in modern-day Taiwan.
The rigorous, days-long written keju tests were quite different from what the academy offered elsewhere. In the Western world, for example, “examiners usually favored giving [oral] essays, a tradition stemming from the ancient Greeks’ affinity for the Socratic method” (Fletcher, 2009). These oral exams, which were typically held once a year and in public, “were more in the nature of public displays or exhibitions to show off brilliant pupils or to glorify teachers” (Kandel, 1936, p. 24). These tests were often highly subjective, and by the mid-nineteenth century, “it was clear to [western] philosophers, scientists, and educators that the popular college tradition of oral qualifying examinations was flawed” (Gallagher, 2003).
It was during this time period that Horace Mann argued for widespread adoption of the common school, “a free, universal, non-sectarian, and public institution” (Warder, 2015). The father of public education, Mann was a revolutionary who saw schoolhouses as “the best means of achieving the moral and socioeconomic uplift of all Americans” (Warder, 2015). As such, Mann “persuaded the Boston Public School Committee to allow him to administer written exams to the city’s children in place of the traditional oral exams. Using a common exam, he hoped to provide objective information about the quality of teaching and learning in urban schools” (Gallagher, 2003). Similar to the Confucian tradition of keju, Mann thought that these common exams would be more equitable than the centuries-old tradition of oral exams. In doing so, “Mann’s goal was to find and replicate the best teaching methods so that all children could have equal opportunities” (Gershon, 2015).
Unfortunately, unlike Mann’s exam, “many of the first widely adopted standardized school tests were designed not to measure achievement but ability” (Gershon, 2015). Thus,
“as early as the mid-19th century, there existed a belief in the role of testing as a vehicle to classify students ex ante, commonly viewed as a necessary step in providing education. Also emerging during this period was an interest in uses of tests ex post: to monitor the effectiveness of schools in accomplishing their purposes. Visionaries like Mann saw testing as a means to educate effectively; administrators, legislators, and the general public turned to tests to see what children were actually learning. The fact that the first formal written examinations in the United States were intended as devices for sorting and classifying but were used also to monitor school effectiveness suggests how far back in American history one can go for evidence of test misuse” (U.S. Congress, 1992).
Written intelligence tests grew in prominence in the early twentieth century and had an aura of scientific objectivity (Gershon, 2015). Around the turn of the century, French psychologist Alfred Binet “began developing a standardized test of intelligence, work that would eventually be incorporated into a version of the modern IQ test, dubbed the Stanford-Binet Intelligence Test” (Fletcher, 2009). Roughly a decade later, during World War I, the U.S. government developed the Army Alpha and Beta tests to “sort soldiers by their mental abilities, which soon became a model for schools” (Gershon, 2015).
Shortly thereafter, in the 1920s, the College Entrance Examination Board began administering the exam that became known as the Scholastic Aptitude Test, or the SAT. Similar to the goal of the Chinese keju system and Mann’s push for more objective common exams, “the SAT was designed partly to make top colleges into places for clever young [people] from all backgrounds, not just the children of the elite” (Gershon, 2015).
These early standardized tests were still somewhat subjective, however, as they were often short essays and almost always graded by hand. In 1936, IBM released the first rudimentary automatic test scanner, which allowed standardized tests to be graded faster than ever before. In 1959, “an education professor at the University of Iowa named Everett Franklin Lindquist (who later pioneered the first generation of optical scanners and the development of the GED test) developed the ACT as a competitor to the SAT” (Fletcher, 2009). And in 1965, “The Elementary and Secondary Education Act in particular opened the way for new and increased uses of norm-referenced tests to evaluate programs” (Alcocer, 2014), a trend that was further exacerbated by the infamous No Child Left Behind Act of 2001.
More than a millennium after the Sui dynasty began the Eastern practice of keju, centuries after Mann advocated for the use of common exams, and decades after the SAT tried to level the playing field for underprivileged children, standardized tests are just as controversial today as ever before. “Modern critics note that standardized test scores largely reflect socioeconomic privilege,” but it is unclear whether those differences are due to the inequities amongst schools or the tests themselves (Gershon, 2015). In fact, “tests don’t necessarily create more social stratification. Instead, they mostly seem to reflect the academic advantages that go with socioeconomic privilege among American kids. But, of course, that’s evidence that despite Horace Mann’s hopes for standardized tests, equal opportunity for all children still hasn’t become reality” (Grodsky et al., 2008).
In other words, what was originally thought of as an innovative way to increase equity has actually made our system more inequitable. Even in Taiwan, which the OECD reports as having one of the most equitable public education systems in the world, after-school bǔxíbān (cram schools) are quite pervasive. Although many schools in Taiwan, South Korea, and Japan are remarkably equitable, these night classes are one way that parents with means use their resources to give their children an unfair advantage, often at an incredible financial and emotional cost.
Compared to the United States, many countries have a more effective and efficient method to assess their students. In Sweden, for example, “standardized examinations are used as scoring benchmarks to help teachers grade students uniformly and properly in their regular classes” (U.S. Congress, 1992). In Taiwan, all students take a national two-day exam at the end of junior high school, which is administered by the Ministry of Education. Students who hope to attend university then have to take two tests at the end of high school, specifically the Subject Competency Test and the Designated Subjects Examination. And contrary to popular belief, even students in Finland take a standardized test, called the “National Matriculation Exam, which everyone takes at the end of a voluntary upper-secondary school, roughly the equivalent of American high school” (Partanen, 2011). In many of the best education systems around the world, Finland included, teachers focus more on formative assessments, which an abundance of research has shown to have incredibly positive effects in the classroom (Black & Wiliam, 2010).
Another major problem with standardized testing in the United States is that private companies with heavy financial incentives are often the entities that administer these tests. These large companies also have significant lobbying power and have tremendously affected domestic education policy. As the Office of Technology Assessment put it, “only in the United States is there a strong commercial test development and publishing market. The importance of this sector, in terms of research, development, and influence on the quality and quantity of testing, cannot be overstated. Even when States and districts create their own tests, they often contract with private companies. In Europe and Asia, testing policies reside in ministries of education” (U.S. Congress, 1992). This should be a major wake-up call for parents, educators, and policy-makers alike.
Looking forward, more colleges than ever are participating in the FairTest movement, which encourages universities to consider allowing students to apply without submitting any standardized test scores. Some critics point to the fact that “while our understanding of the brain and how people learn and think has progressed enormously, standardized tests have remained the same” (Fairtest, 2012). In other places, many parents have opted their children out of taking high-stakes Common Core exams, such as the PARCC exam or the Smarter Balanced Assessment.
Although we are centuries removed from the keju, the United States still uses remnants of the imperial exam system today. Every major professional field, from accounting to teaching to the foreign diplomatic corps, requires some sort of standardized test for licensure. This ideology is deeply ingrained: would you want a doctor to examine you or a lawyer to represent you without passing their qualifying exams? It remains to be seen how standardized tests will impact our future, but it is important to understand their history if we are serious about engaging in a policy debate over how to best serve our youth moving forward.
If you are interested in learning more about the history of testing in American schools, feel free to read the OTA report on the subject, Testing in American Schools: Asking the Right Questions (U.S. Congress, 1992).
Alcocer, P. (2014). NEA Education Policy and Practice. History of Standardized Testing in the United States. Retrieved February 01, 2018, from http://www.nea.org/home/66139.htm#1958-present
American Psychological Association (1993). Learner-Centered Psychological Principles: Guidelines for School Redesign and Reform. Washington, DC: American Psychological Association.
Black, P., & Wiliam, D. (2010). Inside the Black Box: Raising Standards through Classroom Assessment. Phi Delta Kappan, 92(1), 81-90. doi:10.1177/003172171009200119
Fairtest.org (2012). What’s Wrong With Standardized Tests? Retrieved February 15, 2018, from http://www.fairtest.org/facts/whatwron.htm
Fletcher, D. (2009, December 11). Standardized Testing. Retrieved February 4, 2018, from http://content.time.com/time/nation/article/0,8599,1947019,00.html
Gallagher, C. (2003). Reconciling a Tradition of Testing with a New Learning Paradigm. Educational Psychology Review, 15(1), 83-99. Retrieved from http://www.jstor.org/stable/23361535
Gershon, L. (2015, May 12). A Short History of Standardized Tests. Retrieved February 1, 2018, from https://daily.jstor.org/short-history-standardized-tests/
Grodsky, E., Warren, J., & Felts, E. (2008). Testing and Social Stratification in American Education. Annual Review of Sociology, 34, 385-404. Retrieved from http://www.jstor.org/stable/29737796
Kandel, I. L. (1936). Examinations and their substitutes in the United States. New York: Carnegie Foundation for the Advancement of Teaching.
Partanen, A. (2011, December 29). What Americans Keep Ignoring About Finland’s School Success. Retrieved February 1, 2018, from https://www.theatlantic.com/national/archive/2011/12/what-americans-keep-ignoring-about-finlands-school-success/250564/
U.S. Congress, Office of Technology Assessment (1992). Testing in American Schools: Asking the Right Questions, OTA-SET-519. Washington, DC: U.S. Government Printing Office.
Warder, G. (2015). Horace Mann and the creation of the Common School. Retrieved February 4, 2018, from http://www.disabilitymuseum.org/dhm/edu/essay.html?id=42
Zhao, Y. (2014). Who’s Afraid of the Big Bad Dragon?: Why China has the Best (and Worst) Education System in the World. San Francisco, CA: Jossey-Bass.