A measure is construct-valid to the extent that it measures the attribute (construct, factor) it is supposed to measure (Cronbach & Meehl, 1955). Construct validity implies reliability, but reliability does not guarantee construct validity (Thurstone, 1937). Reliability means that a measure reflects a systematic factor, whereas construct validity means that it reflects the systematic factor we want to assess. Construct validity is thus directly related to multidetermination. If several causes affect the results obtained with a method, the method measures each cause but none with perfect validity. The methods in our example were not perfectly valid measures of altruistic personality because individual differences depended on other causes as well. As a consequence of multidetermination, methods have several validities (e.g., achievement tests measure ability with a certain validity but also achievement motivation with a certain validity). If the test was designed to measure ability, its (primary) validity as an ability measure should be much higher than its (secondary) validity as an achievement motivation measure.
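The relation between multidetermination, reliability, and validity can be made concrete with a small numerical sketch. It assumes, purely for illustration, that an observed score is a weighted sum of uncorrelated, unit-variance factors plus independent random error. The function names, the weights, and the error variance below are hypothetical and do not come from the studies cited above.

```python
import math

def score_variance(weights, error_var):
    """Total variance of a score that is a weighted sum of uncorrelated,
    unit-variance factors plus independent random error (assumed model)."""
    return sum(w ** 2 for w in weights.values()) + error_var

def validity(weights, error_var, factor):
    """Correlation between the observed score and one of its factors."""
    return weights[factor] / math.sqrt(score_variance(weights, error_var))

def reliability(weights, error_var):
    """Share of score variance due to all systematic factors combined."""
    total = score_variance(weights, error_var)
    return (total - error_var) / total

# Hypothetical achievement test: weights and error variance are illustrative.
achievement_test = {"ability": 0.8, "achievement_motivation": 0.4}
error_var = 0.2

primary = validity(achievement_test, error_var, "ability")
secondary = validity(achievement_test, error_var, "achievement_motivation")
rel = reliability(achievement_test, error_var)

print(f"primary validity (ability): {primary:.2f}")
print(f"secondary validity (achievement motivation): {secondary:.2f}")
print(f"reliability: {rel:.2f}")
```

Under these made-up values the test is fairly reliable, yet part of its systematic variance stems from achievement motivation rather than ability. Reliability alone therefore says nothing about which systematic factor the score reflects.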
Depending on the measurement purpose, the same factor can be either diagnostically relevant or irrelevant. In our helping example, the approval motive is diagnostically irrelevant and reduces the construct validity of self-reported help as a measure of helpfulness. By contrast, the approval motive is diagnostically relevant if we want to use self-reported help as a social desirability measure. In this case, helpfulness becomes diagnostically irrelevant and reduces the construct validity of our social desirability measure. Assuming that helpfulness is a stronger factor of self-reported help than the approval motive is, self-reported help would measure helpfulness better than it measures social desirability. The example thus shows that the primary validity of a method (here, as a social desirability measure) is sometimes lower than its secondary validity (here, as a helpfulness measure).
Diagnostically irrelevant factors of assessment methods can be method-specific (nonshared) or common (shared). Although both types of factors reduce the construct validity of an assessment method, they have different implications for convergence. Whereas method-specific factors reduce convergence among methods, common method factors increase convergence (Hoyt, 2000). Consider our helpfulness example. If I asked my neighbor and his wife whether my plants had been watered, both answers would probably measure true helpfulness. In addition, however, both answers might reflect social desirability as a second common factor. My neighbor might exaggerate his help because of his approval motive, and so might his wife. She may hope that my approval and gratitude will be directed not only to her husband but also to her. Both factors, helpfulness and social desirability, are common factors here and contribute to convergence. However, social desirability as a diagnostically irrelevant common factor reduces the construct validity of both measures. The example demonstrates that convergence among methods is an insufficient criterion of construct validity (Campbell & Fiske, 1959). Convergence across methods reflects their construct validity only if the methods are heterogeneous in the sense that they share only the diagnostically relevant factor (Houts et al., 1986). Defining heterogeneity in practice is a challenge, however, because separating the diagnostically relevant sources of variance from the irrelevant sources requires what we seek: valid measures. This explains why choice of method is a matter of theory (Fiske, 1987b).
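The opposite effects of method-specific and common irrelevant factors on convergence can be illustrated with a short calculation. The sketch again assumes that each report is a weighted sum of uncorrelated, unit-variance factors plus independent error, so that two reports covary only through factors they share. All factor names, weights, and the error variance are hypothetical values chosen for illustration.

```python
import math

def variance(weights, error_var):
    """Variance of a report under the assumed linear model."""
    return sum(w ** 2 for w in weights.values()) + error_var

def validity(weights, error_var, factor):
    """Correlation between a report and one of its factors."""
    return weights[factor] / math.sqrt(variance(weights, error_var))

def convergence(weights_a, error_a, weights_b, error_b):
    """Expected correlation between two reports; only factors that
    appear under the same name in both reports are shared."""
    shared = set(weights_a) & set(weights_b)
    cov = sum(weights_a[f] * weights_b[f] for f in shared)
    return cov / math.sqrt(variance(weights_a, error_a) * variance(weights_b, error_b))

ERROR = 0.28  # illustrative error variance, identical for both reports

# Each scenario pairs the neighbor's report with his wife's report.
scenarios = {
    "no irrelevant factor": (
        {"helpfulness": 0.6},
        {"helpfulness": 0.6},
    ),
    "method-specific factors": (
        {"helpfulness": 0.6, "approval_motive_neighbor": 0.6},
        {"helpfulness": 0.6, "approval_motive_wife": 0.6},
    ),
    "common factor": (
        {"helpfulness": 0.6, "social_desirability": 0.6},
        {"helpfulness": 0.6, "social_desirability": 0.6},
    ),
}

results = {}
for name, (neighbor, wife) in scenarios.items():
    results[name] = (
        validity(neighbor, ERROR, "helpfulness"),
        convergence(neighbor, ERROR, wife, ERROR),
    )
    print(f"{name}: validity={results[name][0]:.2f}, convergence={results[name][1]:.2f}")
```

In this sketch both kinds of irrelevant factor lower validity by the same amount, but the method-specific version lowers convergence while the common version raises it above the level obtained without any irrelevant factor. High convergence in the last scenario therefore overstates how well either report measures helpfulness.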