¹ In this handbook the term multilevel analysis will also be used for statistical methods for analyzing nested data (e.g., students nested within classes, measurement occasions nested within individuals). There are strong differences between multilevel analysis as a research program for measuring different determinants of behavior and multilevel analysis as a statistical method. However, the appropriate meaning will be clear from the context.

Validity, one of the key issues of research, concerns the question of whether the inferences drawn from the results of a study are true or not (Shadish, Cook, & Campbell, 2002). With respect to measurement methods in particular, validity represents the degree to which the adequacy and appropriateness of inferences and actions based on the results of a measurement device are supported by empirical evidence and theoretical rationales (Messick, 1989). Multimethod research plays a key role in the validation process. In their groundbreaking article, "Convergent and discriminant validation by the multitrait-multimethod matrix," Campbell and Fiske (1959) described the cornerstones of a multitrait-multimethod research program for the validation process. The basic premises of the multitrait-multimethod approach have strongly influenced the process of exploring validity. First, Campbell and Fiske pointed out that several methods are needed to appropriately analyze validity, and that these different methods should converge in the measurement of the same trait. The convergence of different independent methods indicates convergent validity. Second, they convincingly argued that discriminant validity must be shown before introducing a new construct into science. Third, Campbell and Fiske clarified that a score on a psychological variable reflects not only the psychological construct under consideration but also systematic method-specific influences. Fourth, they demonstrated the necessity of including at least two different methods in psychological studies to separate trait from method influences. Hence, a complete understanding of psychological processes requires a multimethod research strategy, and multitrait-multimethod analysis has become an essential strategy for establishing the construct validity of psychological measures.
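The logic of convergent and discriminant validation can be illustrated with a minimal sketch. It assumes two traits, each measured by two methods, and sorts the six correlations among the four measures into the three classes commonly distinguished in a multitrait-multimethod matrix. The trait and method labels, the correlation values, and the function names are hypothetical and serve only as an illustration; they are not taken from Campbell and Fiske (1959).

```python
from itertools import combinations

# Hypothetical measures, each a (trait, method) pair.
measures = [("T1", "self"), ("T2", "self"), ("T1", "peer"), ("T2", "peer")]

# Made-up correlations between measures i and j (illustrative values only).
corr = {
    (0, 1): 0.40,  # T1-self with T2-self
    (0, 2): 0.55,  # T1-self with T1-peer
    (0, 3): 0.15,  # T1-self with T2-peer
    (1, 2): 0.10,  # T2-self with T1-peer
    (1, 3): 0.50,  # T2-self with T2-peer
    (2, 3): 0.35,  # T1-peer with T2-peer
}


def classify(a, b):
    """Name the multitrait-multimethod class of a pair of measures."""
    same_trait = a[0] == b[0]
    same_method = a[1] == b[1]
    if same_trait and not same_method:
        return "monotrait-heteromethod"  # convergent validity coefficients
    if not same_trait and same_method:
        return "heterotrait-monomethod"
    if not same_trait and not same_method:
        return "heterotrait-heteromethod"
    raise ValueError("the same measure was listed twice")


def summarize(measures, corr):
    """Return the mean correlation within each class."""
    groups = {}
    for i, j in combinations(range(len(measures)), 2):
        label = classify(measures[i], measures[j])
        groups.setdefault(label, []).append(corr[(i, j)])
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


summary = summarize(measures, corr)
for label, mean_r in sorted(summary.items()):
    print(f"{label}: mean r = {mean_r:.3f}")
```

In this made-up example the mean monotrait-heteromethod correlation exceeds both heterotrait means, which is the pattern one would expect when convergent and discriminant validity both hold. Comparing class means is a simplification; a full analysis would inspect the individual coefficients rather than their averages.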

Convergent validity is a core aspect of validity, and validation research programs have long focused on seeking high convergent validity coefficients. Although high validity coefficients are desirable, there are many reasons why convergent validity coefficients are often lower than hoped. For example, if one compares physiological measures with other measures, one must contend with individual response uniqueness (e.g., Berntson & Cacioppo, 2004). Not all individuals react to a stimulus in the same way, and this response specificity can lower convergence when it is measured with a correlation coefficient. Moreover, if one compares a self-rating with a peer rating, one often finds only medium-sized correlation coefficients. In comparing self- and other-ratings, one must take rater biases into account (Hoyt, 2000). Raters may not only interpret scale items differently; they might also have opportunities to observe different behavior, use different indicators of behavior, and link those indicators to the response scale in different ways (Hoyt, 2000; Kenny, 1991). In addition, leniency or severity errors and halo effects can affect peer ratings, and peer ratings as well as self-ratings might be distorted by social desirability effects (Neyer, this volume, chap. 4). All these forms of bias and distortion can lower convergent validity coefficients. Therefore, Westen and Rosenthal (2003) recommend quantifying construct validity by comparing the observed pattern of correlations with the theoretically expected pattern. They contend that if there is a good theoretical reason to expect lower correlations between multiple measures, and this pattern of correlations can be empirically confirmed, then modest degrees of convergence can confirm construct validity.
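The idea of comparing expected and observed patterns of correlations can be sketched as follows. The sketch assumes that the researcher states a predicted correlation for each criterion measure in advance and then correlates these predictions with the Fisher z-transformed observed correlations. It is written in the spirit of Westen and Rosenthal's proposal rather than as a reproduction of their indices, and the function names and all values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pattern_agreement(predicted, observed):
    """Correlate predicted correlations with Fisher z-transformed observed ones."""
    observed_z = [math.atanh(r) for r in observed]  # Fisher z transformation
    return pearson(predicted, observed_z)


# Made-up values: correlations of a new measure with four criterion measures.
predicted = [0.60, 0.30, 0.10, -0.20]  # expected on theoretical grounds
observed = [0.50, 0.25, 0.05, -0.30]   # found in a hypothetical sample

agreement = pattern_agreement(predicted, observed)
print(f"pattern agreement: {agreement:.2f}")
```

A value close to 1 indicates that the observed correlations rise and fall as the theory predicts, even when some of the individual coefficients are modest in size. The index says nothing about sampling error, so it would need to be supplemented by a significance test or confidence interval in an actual study.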

High convergent validity is not always the goal of research. Take, for example, a questionnaire measuring different facets of marital satisfaction. Spouses rate their own satisfaction and also their perception of the satisfaction of their spouse. If the aim of the test construction process was to develop a questionnaire that detects deficiencies in intraspouse perception and communication processes, the items with the lowest convergence might be the most interesting. In other words, method influences are not inevitably unwanted random disturbances (e.g., measurement error); they can carry valid and valuable information. A deeper understanding of method influences can enlarge our knowledge of the construct under consideration, and this knowledge, in turn, can help explain method effects, correct for them, and plan and conduct studies in which method effects are minimized or, depending on the aim of the study, maximized. Beyond the traditional search for maximum convergent validity, a thorough analysis of method influences might tell a more interesting story about the construct under consideration. Hence, a multimethod study should always have two facets: first, the demonstration of convergent validity on the basis of theoretical expectations, and second, the analysis of the nature of method-specific influences. Whereas multimethod studies are usually designed to meet the first goal, the second goal is often not considered when planning the study's design. A careful analysis of method effects requires the inclusion of variables that may explain method influences and that might suppress method-specific effects to enhance convergent validity. This makes a thorough knowledge of measurement methods necessary for all researchers.

