Just Whose Idea Was All This Testing?
By Jay Mathews
Washington Post Staff Writer
Tuesday, November 14, 2006; A06
The Washington Post:
In ancient Greece, Socrates tested his students through conversations. Answers were not scored as right or wrong. They just led to more dialogue. Many intellectual elites in the 5th and 4th centuries B.C. cared more about finding the path to higher knowledge than producing a correct response. To them, accuracy was for shopkeepers.
Today, educators often hold up the Socratic method as the best kind of teaching.
So how did we go from that ideal to an educational model shaped -- and perhaps even ruled -- by standardized, normed, charted, graphed, regressed, calibrated and validated testing? Students in the Washington area are likely to know more about the MSA (Maryland School Assessments), the SOL (Virginia's Standards of Learning) and the D.C. CAS (D.C. Comprehensive Assessment System) than they do about Socrates and his illustrious student Plato.
Critics say standardized testing has robbed schools of the creative clash of intellects that makes Plato's dialogues still absorbing. "There is a growing technology of testing that permits us now to do in nanoseconds things that we shouldn't be doing at all," said educational psychologist Gerald W. Bracey, research columnist for the Phi Delta Kappan education journal.
Historians call the rise of testing an inevitable outgrowth of expanding technology. As goods and services are delivered with greater speed and in higher quantity and quality, education has been forced to pick up the pace.
Standardized exams have many sources. In imperial China in the 7th century A.D., government job applicants had to write essays about Confucian philosophy and compose poetry. In Europe, the invention of the printing press and modern paper manufacturing fueled the growth of written exams.
By 1845 in the United States, public education advocate Horace Mann was calling for standardized essay testing. Spelling tests, geography tests and math tests blossomed in schools, although they were rarely standardized.
At the outset of the 20th century, educators began to experiment with tests that took shortcuts around the old essay methods. French psychologist Alfred Binet developed an intelligence test about 1905. Frederick J. Kelly of the University of Kansas designed a multiple-choice test in 1914. Scanning machines followed. Many Americans accepted these tests as efficient tools to help build a society based on merit, not birth or race or wealth.
Still, modern testing had a clumsy start as psychologists experimented with exams to help employers, schools and others rate applicants. In one early case, testing expert H.H. Goddard identified as "feeble-minded" 83 percent of Jews, 80 percent of Hungarians, 79 percent of Italians and 87 percent of Russians among a small group of immigrants assessed at Ellis Island.
"Consider a group of frightened men and women who speak no English and who have just endured an oceanic voyage in steerage," Harvard University science historian Stephen Jay Gould wrote of the Goddard study. "Most are poor and have never gone to school; many have never held a pencil or pen in their hand." Yet Goddard's interviewers expected them to sit down with a pencil and "reproduce on paper a figure shown to them a moment ago, but now withdrawn from their sight."
Eventually, testing experts focused on standardizing the measure of learning, not of innate intelligence.
The College Entrance Examination Board, founded in 1900, played a huge role. Now called the College Board, it "created the best, most consistent and most influential standards that American education has ever known," New York University educational historian Diane Ravitch wrote in March in the Chronicle of Higher Education.
The board's early exams were written and graded by teachers and professors and had no multiple-choice questions. These essay exams, Ravitch wrote, led "everyone who went to high school, whether they were the children of doctors or farmers or factory workers . . . to study mathematics, science, English literature, composition, history and a foreign language, usually Latin."
Many educators who value depth and rigor lament what followed. In 1926, the multiple-choice SAT was introduced as a much faster way of testing college applicants. On Dec. 7, 1941, several members of the board, during a previously scheduled lunch, decided that the outbreak of world war would require faster decisions and less leisurely testing. They eventually canceled the board's old exam format. The SAT ruled.
Essay questions, however, made a comeback in 1955 when Advanced Placement exams began.
The launch of Sputnik, the Soviet space satellite, in 1957 fueled a space race and increased pressure on U.S. schools to show improvement. But rating schools through tests did not advance much until the mid-1970s, when the College Board revealed that average SAT scores had been falling since 1963. Then, in 1983, a national commission declared in the report "A Nation at Risk" that public school standards were too low. Over the next two decades, testing took off.
In the 1980s and early 1990s, several governors argued that they had to test all their students to raise school standards and improve their economies. Among them were Democrats Bill Clinton of Arkansas and Richard W. Riley of South Carolina, who would soon become president and U.S. education secretary, respectively. (Later in the 1990s, Republican Gov. George W. Bush of Texas also was a big proponent of testing.)
Some educators said a better way to improve schools was to spend more on teacher training, salaries and smaller classes. They dwelled on educational inputs; the politicians, on outputs.
The politicians prevailed. In 1988, Congress created the National Assessment Governing Board. It established new standards for the National Assessment of Educational Progress, a test that has been given to a sampling of students since 1970. In 2002, President Bush signed the No Child Left Behind law. For the first time, it required annual testing of all public school children in certain grades and required states to use results to help rate schools.
The National Education Association and other teacher organizations argue that it is unfair to rate schools through such tests when teachers lack adequate training and pay. In a 2004 essay for the Hoover Digest, Ravitch wrote that the advocates of inputs and the champions of outputs "are in constant tension, with first one and then the other gaining brief advantage."
"How this conflict is resolved," she wrote, "will determine the future of American education."