Cultural Test Bias as an Explanation

The bias explanation of score differences has led to the cultural test bias hypothesis (CTBH; Brown et al., 1999; Reynolds, 1982a, 1982b; Reynolds & Brown, 1984b). According to the CTBH, differences in mean performance for members of different ethnic groups do not reflect real differences among groups but are artifacts of tests or of the measurement process. This approach holds that ability tests contain systematic error occurring as a function of group membership or other nominal variables that should be irrelevant. That is, people who should obtain equal scores obtain unequal ones because of their ethnicities, genders, socioeconomic levels, and the like.

With respect to socioeconomic status (SES), Eells, Davis, Havighurst, Herrick, and Tyler (1951) summarized the logic of the CTBH as follows: If (a) children of different SES levels have experiences of different kinds and with different types of material, and if (b) intelligence tests contain a disproportionate amount of material drawn from cultural experiences most familiar to high-SES children, then (c) high-SES children should have higher IQ scores than low-SES children. As Eells et al. observed, this argument tends to imply that IQ differences are artifacts that depend on item content and "do not reflect accurately any important underlying ability" (p. 4) in the individual.

Since the 1960s, the CTBH explanation has stimulated numerous studies, which in turn have largely refuted the explanation. Lengthy reviews are now available (e.g., Jensen, 1980; Reynolds, 1995, 1998a; Reynolds & Brown, 1984b). This literature suggests that tests whose development, standardization, and reliability are sound and well documented are not biased against native-born American racial or ethnic minorities. Studies do occasionally indicate bias, but it is usually small, and it most often favors minorities.

Results cited to support content bias indicate that item biases account for less than 1% to about 5% of the variation in test scores. In addition, such bias is usually counterbalanced across groups. That is, when bias against an ethnic group occurs, comparable bias favoring that group also occurs and cancels it out. When apparent bias is counterbalanced, it may be random rather than systematic, and therefore not bias after all. Item or subtest refinements also frequently reduce and counterbalance whatever bias is present.
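The counterbalancing argument above can be illustrated with a small arithmetic sketch. The item-level values below are invented for illustration and do not come from any study cited here; the function names are likewise hypothetical. The point is only that signed item-level effects can be sizable in aggregate yet sum to nothing in the total score.

```python
# Hypothetical signed item-level effects for one group, in hundredths of a
# raw-score point. Negative values work against the group, positive values
# favor it. These numbers are invented for illustration only.
item_effects = [-30, 20, -20, 30, 0, 0]


def net_effect(effects):
    """Signed sum: the effect that survives in the total score."""
    return sum(effects)


def gross_effect(effects):
    """Sum of absolute values: how much item-level effect exists at all."""
    return sum(abs(e) for e in effects)


print(net_effect(item_effects))    # 0
print(gross_effect(item_effects))  # 100
```

In this made-up case, a full raw-score point of item-level effect exists across the test, but the effects against and in favor of the group offset each other exactly, so the total score is unaffected.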

No one explanation is likely to account for test score differences in their entirety. A contemporary approach to statistics, in which effects of zero are rare or even nonexistent, suggests that tests, test settings, and nontest factors may all contribute to group differences (see also Bouchard & Segal, 1985; Flynn, 1991; Loehlin, Lindzey, & Spuhler, 1975).

Some authors, most notably Mercer (1979; see also Lonner, 1985; Helms, 1992), have reframed the test bias hypothesis over time. Mercer argued that the lower scores of ethnic minorities on aptitude tests can be traced to the Anglocentrism, or adherence to White, middle-class value systems, of these tests. Mercer's assessment system, the System of Multicultural Pluralistic Assessment (SOMPA), effectively equated ethnic minorities' intelligence scores by applying complex demographic corrections. The SOMPA was popular for several years but is used less commonly today because of its conceptual and statistical limitations (Reynolds, Lowe, et al., 1999). Helms's position receives attention below (Helms and Cultural Equivalence).

