by Brian Dunning

December 7, 2010
Don't you just love the idea that your level of intelligence can be boiled down to a single number, and ranked along with those of all the other dummies in the world? You may have taken an IQ test in the past, and may even know your score. It's an unfortunate fact of statistics that half the people walking around are below average intelligence (the way the tests are scored ensures that 100 is both the median and the average), and sometimes we question the value of force-ranking ourselves, and assigning so much cultural significance and stigma to it, based on one narrow metric. IQ tests look like the ideal place to point Skeptoid's skeptical eye.

There are a number of obvious criticisms of the idea of ranking everyone with a single number that purports to encompass how intelligent they are. Some people are "book smart" but with no "common sense", and some are the opposite. Some people have high or low creativity or humor, but may ace all their tests in school or fail them. Each of us is complex, with many strengths and weaknesses, aptitudes and preferences, and it seems that any one number purporting to quantify our intelligence must be grossly misleading in every case.

There are even obvious criticisms of the tests themselves. There are a number of different IQ tests in use, and it's well established that the same people will score differently on the various tests: I might get a higher score than you on one test, while you outscore me on another. Critics often point out that any IQ test is necessarily skewed toward a particular cultural frame of reference, making it unfair to measure someone from Africa using a test developed in Denmark (for example).

These basic criticisms are answered by a closer study of what IQ tests actually purport to measure. They've got nothing to do with "book smarts" and are intended to have no cultural relevance. The tests measure only your intelligence. There are as many different definitions of intelligence as there are psychologists, but we can extract some common themes from the definitions offered by those who have played the biggest roles in developing these tests. Generally speaking, your intelligence is your problem solving and reasoning ability. It encompasses learning, planning, and understanding.

IQ testing has an ominous history. It originally grew out of the eugenics movement in the United States around the turn of the twentieth century. The basic idea of eugenics was to identify desirable traits, such as intelligence, health, and even financial success, and to increase birth rates among such people. At the same time, births among people with negative traits, such as lower intelligence, criminal behavior, poverty, and illness, would be discouraged. When it was discovered that heredity played a large role in some mental illnesses, forced sterilization was imposed upon mental patients in some states in an effort to breed such traits out of the population. According to most counts, some 64,000 mentally ill Americans were sterilized before the practice was finally terminated in the 1960s. In the Nuremberg Trials, it was revealed that the Nazis considered the American program so effective that it was the inspiration for their own forced sterilization of some 450,000 people.

The father of eugenics was the Englishman Sir Francis Galton, a cousin of Charles Darwin. Over the course of Galton's varied and productive career, he not only codified the science of eugenics but also pioneered psychometrics as a tool for measuring people's intelligence and determining whether it would be best for them to breed or not. Galton coined the phrase "nature versus nurture" and identified the trend of regression towards the mean, though his original term for this was "reversion towards mediocrity". In his view, so long as unintelligent people were allowed to reproduce freely, mankind could never rise above its native mediocrity.

American eugenicists needed a tool for quantitatively identifying mental retardation, and so they turned to two French researchers, Alfred Binet and Théodore Simon, who had developed the Binet-Simon test as a way of identifying French schoolchildren who needed special assistance. Binet-Simon did not ask questions about general knowledge; instead, it imposed a diverse system of tasks, from simple physical tests to memory puzzles. The resulting score was expressed as the mental age.

Lewis Terman, a psychologist from Stanford University, translated and improved the test in 1916, and it became known as the Stanford-Binet. The result was your Intelligence Quotient, a quotient of your mental age divided by your chronological age, multiplied by 100. If you were 10 years old but had the reasoning ability of a 15-year-old, your IQ was 150. For the first time, eugenicists had a tool that could spell out, in black and white, a person's value to society.
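The arithmetic behind that original quotient can be sketched in a few lines of Python. This is only an illustration: it assumes the conventional scaling by 100 (which is what turns the 15-over-10 example into 150), and the function name ratio_iq is made up for this sketch.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Old-style ratio IQ: mental age over chronological age, scaled by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * mental_age / chronological_age


# A 10-year-old reasoning like a 15-year-old, as in the example above.
print(ratio_iq(15, 10))  # 150.0

# A child whose mental age matches their actual age lands on 100.
print(ratio_iq(10, 10))  # 100.0
```

One consequence of the formula is that it only makes sense for children: mental age levels off in adulthood while chronological age keeps climbing, so the quotient would drift downward with every birthday.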

World War I saw widespread adoption of intelligence testing by the United States Army. The intent was that the most intelligent recruits would be sent to officer training, the least intelligent would be rejected from service, and those in the middle assigned to technical, combat, or other duties according to their scores. But the process didn't go as smoothly as its proponents hoped. Different testing methodologies were tried, there were inadequate resources for testing such large numbers of men, and many of the results were controversial.

What arose from this was a thorough revision of the scoring, developed by David Wechsler, the chief psychologist at Bellevue Psychiatric Hospital. As a young man he'd worked with the Army during its troubled attempt at implementing intelligence testing. His innovation was to grade the tests on a curve, with your score representing your placement within the distribution of all the aggregated scores. This is now the universal standard. The scoring is designed in such a way that graphing all the scores of a given population will result in a perfect bell curve. The intent is for the peak of the curve to hit exactly at a score of 100 (which should represent about 2.7% of the population), with the long tails of the curve petering out at about 50 and 150. For those of a statistical mindset, the distribution is intended to have a standard deviation of 15. Whenever the tests are revised (we're now using Stanford-Binet 5), the scoring system is reset so that the average is again 100 and the standard deviation is again 15. We still call it the IQ, even though it's no longer a quotient.
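For those who want to see the curve-based scoring in action, here is a minimal Python sketch. It makes a simplifying assumption: a raw test score is standardized against a small, hypothetical norming sample and then rescaled to a mean of 100 and a standard deviation of 15. Real tests rely on large norming studies and published conversion tables, and the names deviation_iq and share_scoring are invented for this illustration.

```python
from statistics import NormalDist, mean, pstdev

# The target scale described above: mean 100, standard deviation 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def deviation_iq(raw_score, norm_sample):
    """Place a raw score on the IQ scale relative to a norming sample."""
    mu = mean(norm_sample)
    sigma = pstdev(norm_sample)
    z = (raw_score - mu) / sigma  # distance from the mean, in standard deviations
    return 100 + 15 * z


def share_scoring(score):
    """Expected fraction of people whose score rounds to this whole number."""
    return IQ_SCALE.cdf(score + 0.5) - IQ_SCALE.cdf(score - 0.5)


# Hypothetical norming sample of raw scores (mean 50, standard deviation 10).
sample = [40, 60]
print(deviation_iq(50, sample))  # 100.0, exactly average
print(deviation_iq(60, sample))  # 115.0, one standard deviation above

# Share of the population at the very peak of the curve:
# roughly 0.027, in line with the 2.7% figure.
print(round(share_scoring(100), 3))
```

Because the score is defined by position in the distribution rather than by any ratio of ages, it works equally well for adults and children, which the old quotient could not do.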

Well, we're not practicing institutionalized eugenics anymore, and IQ scores no longer restrict where we can go and what job we can have, so is all the controversy gone from IQ testing? Not hardly. It was gone, for the most part, until the 1994 publication of The Bell Curve: Intelligence and Class Structure in American Life, a book by Harvard experimental psychologist Richard Herrnstein and conservative political scientist Charles Murray. The controversy came raging back with a vengeance. The Bell Curve's central thesis pointed out many inconvenient and politically incorrect sociopolitical implications of IQ scores.

The nice way of summarizing it is that intelligence is the strongest predictor of factors such as professional success, criminal activity, and divorce rates, and thus correlates strongly with various sociopolitical and ethnic groups across the country. The harsh way of summarizing its most controversial chapters is that blacks are less intelligent than whites. This finding triggered a tsunami of academic and popular criticism that publisher Free Press rode all the way to the bank, and that kept The Bell Curve squarely on the best-seller list.

The most troubling finding by the authors was that intelligence appeared to be the result of a combination of both nature and nurture. In simplified terms, this means that race plays at least some role in determining intelligence. The criticism of this claim came from many different directions: that Herrnstein and Murray had used flawed weighting in their statistical measurements; that their studies were improperly controlled; that they'd ignored contradictory research; and that they'd based their research on unproven assumptions. Unfortunately it's nearly hopeless for a layperson to try to evaluate either the claims or the criticism; one quickly discovers that the statistics involved are extremely complex.

About a year after The Bell Curve was published and the charges of racism had been thoroughly aired, the American Psychological Association decided to write its own report to specifically address the book's findings. A diverse task force of American psychology professors was assembled to "identify, examine and summarize relevant research on intelligence." Of the difference between blacks and whites, the APA confirmed that there has long been about a 15-point difference, which is one standard deviation; but it also found that there is no clear reason for this, and there is certainly not sufficient evidence to point to a genetic cause. Society is very complicated, and many factors appear to affect intelligence. Some of these suspected factors, most of which are unproven, include nutrition, education, English skills, experience with testing, and heritability.

The APA's final conclusion was critical not just of The Bell Curve, but of the debate in general:

"In a field where so many issues are unresolved and so many questions unanswered, the confident tone that has characterized most of the debate on these topics is clearly out of place. The study of intelligence does not need politicized assertions and recriminations; it needs self-restraint, reflection, and a great deal more research. The questions that remain are socially as well as scientifically important. There is no reason to think them unanswerable, but finding the answers will require a shared and sustained effort as well as the commitment of substantial scientific resources. Just such a commitment is what we strongly recommend."

Two of these unanswered questions stand out as particularly intriguing: the racial differences, and something called the Flynn effect. It may turn out that they're related. New Zealand political scientist Jim Flynn first noted that every time intelligence tests have been revised, average scores worldwide have gone way up, by about a standard deviation; and it's been necessary to reset 100 to a higher and higher point. People have been getting more intelligent ever since testing began, and some believe this improvement is accelerating. The reasons for the Flynn effect are unknown, but hypotheses usually center on the nurture factors for intelligence, such as an increasingly intensive academic environment and healthcare. The Flynn effect is proven to change scores by at least as much as the racial differences that have been found, and it's possible (though far from evidenced) that unequal distribution of the same intelligence-nurturing resources responsible for the Flynn effect may be responsible for the racial differences.

And so, while the roots of IQ testing came from the inherently negative process of identifying and culling out the worst of humanity, its future may prove to be crucial in helping everyone develop to a higher potential. Eugenics is one of those shameful follies that can't be uninvented, but its lessons may not have been entirely without fruit. When Binet and Simon first set out to learn how to find the schoolchildren who needed special help, they may have been onto something with far broader application. Theirs was not the spirit of culling, but the spirit of helping; and intelligence testing will always be linked to both.

