Most social psychologists agree that a multimethod perspective in conducting research is useful if not necessary. One reason for this agreement is that we can all quickly think of topic areas in which progress was held back initially by an inadequate application of multimethods, but in which progress was also advanced by a more effective application in the long run. Research on social facilitation (e.g., Guerin, 1986; Triplett, 1898; Zajonc, 1965) and group polarization (e.g., Myers & Lamm, 1976; Stoner, 1961) are classic, well-worn examples. In the case of group polarization, the initial research appeared to show that group decisions tended to be riskier than individuals' decisions (Stoner, 1961). However, this "risky shift" turned out to be dependent on the nature of the decision, and thus was an artifact of insufficient stimulus sampling (e.g., Burnstein & Vinokur, 1975; Myers & Lamm, 1976). Research using a representative range of decision types showed that groups actually tend to polarize decision making, hence the revised label of "group polarization." Both the social facilitation research and the group polarization research are part of social psychologists' collective memory of instances in which oversights, failures, or glitches in our use of multimethods slowed progress more than necessary.
Recommendations for the field. If most social psychologists agree that a multimethod perspective in conducting research is useful if not necessary, the question then becomes, Why is this perspective not adopted more frequently in the studies we publish? Perhaps the most honest answer to that question is that a multimethod approach, particularly one that involves all the levels of multimethod analysis that we describe here, is quite simply a lot of work. Also, given the publish-or-perish pressures of the tenure track, the vague disrespect often given to mere replications, and the seemingly ever-expanding number of studies required per manuscript to get accepted at top journals, the temptation is great indeed to stick with a measure and/or manipulation that you know works.
The ultimate consequence of such pressures is that our discipline is confronted with a social trap: It is in most researchers' individual best interest to get more research done more quickly by using a small number of previously validated procedures and without bothering to replicate findings with other procedures or samples; yet when everybody does so, the knowledge base of our field suffers. To put it another way, until a multimethod approach to conducting research becomes either normative or required in our discipline, the costs of such approaches in terms of time, reduced productivity, and the risk of inconsistent results generally outweigh the perceived benefits to validity and theory that accrue. We thus pay lip service to multimethod approaches in much the same way we do the necessity for cross-cultural replication: Sounds great, and somebody needs to do it, but just not me.
In his classic article, "The Tragedy of the Commons," Garrett Hardin noted that appeals to better behavior rarely work to solve social traps and that instead what is needed is "mutual coercion, mutually agreed upon" (1968, p. 1247). Should our discipline arrive at the consensus that demonstrating construct validity through multimethod research is important, measures could in principle be taken to ensure that it is adopted more frequently. Precedent for such actions has been established, as seen, for example, in the Task Force on Statistical Inference (Wilkinson, 1999). For many years, writers had been decrying the single-minded pursuit of p levels and neglect of effect sizes, to little or no effect (Cohen, 1994; Harris, 1991; Meehl, 1978). But after the American Psychological Association (APA) convened the Task Force and changed APA style in response to its recommendations, mandating that effect sizes accompany each focused test of significance, such reporting of effect sizes is now routine. Although APA policy strictly speaking applies only to journals published by APA, most other psychological journals have followed suit.
A less heavy-handed solution our field could take is to institute norms for using multimethod approaches in a multitiered fashion through changes in editorial policy among the premier journals of our discipline. It would require only a small shift in editorial policy to request that follow-up studies within a manuscript show convergence across operationalizations of the independent and dependent variables. Change in policy could occur on a grassroots basis as well, if manuscript reviewers started including multimethod convergence as one of the criteria they evaluate before recommending acceptance in a top-tier journal.
A third course of action our field can take is to improve the education of our graduate students. In many PhD programs in social psychology, the only coverage of multimethod issues is the assignment of Campbell and Fiske's (1959) classic article. And in general, procedures for demonstrating validity often receive short shrift compared to the attention paid to reliability. This is probably because assessing reliability is a relatively cut-and-dried matter (you run your test-retest rs or coefficient alphas), whereas there is no single standard procedure for assessing validity. Although this makes teaching validity and multimethod issues fuzzier, validity is no less important than reliability and should be given equal weight in our training.
Recommendations for individual researchers. We are pragmatic enough to realize that few, if any, of the preceding recommendations will ever actually be adopted within our discipline. What advice, then, can we offer to individual researchers who wish to enhance their use of multimethod approaches? First, we suggest that researchers do a better job of explicitly acknowledging the multimethod convergence of their findings when writing manuscripts. The more we acknowledge and appreciate multimethod convergence when it happens, the more we will notice it when it is not there.
Second, researchers should acquaint themselves with, and take advantage of, the wide variety of statistical approaches available for demonstrating multimethod agreement, such as simple correlational analyses, confirmatory factor analysis or structural equation modeling (e.g., Cunningham et al., 2001; Kiers, Takane, & ten Berge, 1996; Koeltinger, 1991; Millsap, 1995a; Schmitt & Stults, 1986; see also Eid, Lischetzke, & Nussbeck, chap. 20, this volume), multilevel modeling (e.g., Livert, Rindskopf, Saxe, & Stirratt, 2001), treating stimuli as random factors (e.g., Kenny, 1995) in the case of analyzing within-method replication, and various computations available for assessing the success of between-method replications (Rosenthal, 1990). Even a simple correlation between two measures differing in sources of method variance can go a very long way in demonstrating multimethod agreement, especially if the alternative is relying on a single measure. Another simple approach to quantifying construct validity has been recently proposed by Westen and Rosenthal (2003), who introduced two straightforward metrics for gauging the extent of agreement between hypothesized and obtained patterns of intercorrelations.
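The two simplest approaches mentioned above can be illustrated with a minimal sketch. It is not a reproduction of any cited author's procedure: the function names and all data values are hypothetical, and the pattern-agreement index is only in the spirit of Westen and Rosenthal's (2003) proposal, assuming that agreement is gauged by correlating the hypothesized correlations with the Fisher z-transformed obtained correlations.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return sxy / math.sqrt(sxx * syy)


def pattern_agreement(hypothesized_rs, observed_rs):
    """Correlate hypothesized correlations with Fisher z-transformed
    observed correlations (an assumed, simplified agreement index)."""
    observed_z = [math.atanh(r) for r in observed_rs]
    return pearson_r(hypothesized_rs, observed_z)


# Hypothetical scores for the same eight participants on two measures
# that differ in method (e.g., a self-report and a peer rating).
self_report = [3, 5, 2, 4, 4, 1, 5, 2]
peer_rating = [2, 5, 3, 4, 5, 1, 4, 2]
convergence = pearson_r(self_report, peer_rating)

# Hypothetical predicted and obtained correlations between a new measure
# and four criterion variables.
hypothesized_rs = [0.6, 0.4, 0.1, -0.2]
observed_rs = [0.55, 0.30, 0.05, -0.25]
agreement = pattern_agreement(hypothesized_rs, observed_rs)

print(f"cross-method convergence r = {convergence:.2f}")
print(f"pattern agreement r = {agreement:.2f}")
```

A high value on the first index suggests that the two methods converge on the same construct. A high value on the second suggests that the obtained pattern of correlations resembles the predicted one. Neither replaces the fuller modeling approaches cited above.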
Third, as mentioned earlier, researchers in their role as manuscript reviewers can encourage the use of multimethod approaches in others' work by noting reliance on single-source measures or inadequate attention to stimulus sampling as a major limitation to a manuscript, perhaps even precluding publication, and by lauding evidence of attention to multimethod issues as a strength. In this vein, reviewers could help by not insisting on perfect consistency or significant results across all measures. Researchers will be more hesitant to try or report novel methods if they believe an inconsistent result would doom their publication chances. Also, as we have discussed earlier, failures to replicate across dependent measures, independent variable manipulations, or population type and culture can advance understanding by pointing out limiting conditions as much as replication can.
Researchers can and should also take care not to become paradigm bound. A measure or procedure that works and that all your buddies working in this area use is indeed convenient. But sooner or later you will have learned all that you can with this approach, or at the least you will miss out on what you could have learned with another approach. Social psychologists can benefit from being more familiar with research done by developmental psychologists, who tend to use multiple methods more frequently. Reading the work done by methodologically inventive researchers in personality and clinical psychology can also be very enlightening. Researchers working in other fields have developed techniques and measures that could, with minor fiddling, be put to good use in one's own field. The IAT is a good example of this, as it was originally designed to serve as an implicit measure of racial prejudice. However, it has quickly caught on in other areas of psychology and has been used to measure anything from clinical phobias (Teachman, Gregg, & Woody, 2001) to self-esteem (Greenwald & Farnham, 2000).
Finally, researchers should be cognizant that some forms of multimethod validation are more necessary than others, and we should design and evaluate studies accordingly. Stimulus sampling concerns will be less worrisome when the critical independent variable, say race, is manipulated as one word or phrase in two otherwise identical stimulus paragraphs than they would be when only one black and one white confederate are used. If the black confederate just happened to have an unpleasant personality, for example, a more negative reaction to him or her could mean many other things besides racism. Similarly, relying on a single self-report dependent variable may be less troublesome if it involves a relatively clear-cut topic that is not likely to be prone to distortion as a result of social desirability. Lastly, as Mook (1983) so eloquently pointed out, there is a time and place for external invalidity. In some cases, for example, the initial stages of a program of research when one is happy simply to show that a given result is theoretically possible, demonstrating the result with a single measure or operationalization can be truly informative in and of itself.